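Parsing a Struct in Swift the Way HandyJSON Does
- HandyJSON
- Parsing a struct from the source code
- Getting TargetStructMetadata
- Getting TargetStructDescriptor
- Implementing TargetRelativeDirectPointer
- FieldDescriptor and FieldRecord
- Computing offsets with fieldOffsetVectorOffset
- Verifying the code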
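HandyJSON
HandyJSON is a framework developed by Alibaba for converting JSON data into the corresponding model in Swift. Compared with other popular Swift JSON libraries, its distinguishing features are that it supports pure Swift classes and is simple to use. When deserializing (converting JSON into a model), it does not require the model to inherit from NSObject (it is not based on KVC), nor does it require you to define a mapping function for the model. Once you define the model type and declare that it conforms to the HandyJSON protocol, HandyJSON uses each property's name as the key and parses the values from the JSON string on its own. However, because HandyJSON is built on Swift's metadata, it may simply stop working if the layout of that metadata changes. Alibaba has kept maintaining the framework, so it should be updated accordingly when the Swift source changes.
HandyJSON on GitHub
Parsing a struct from the source code
Getting TargetStructMetadata
Because HandyJSON is built on Swift's metadata, parsing a struct requires understanding that metadata first. Below we look for the metadata definitions in the Swift source.
First, searching Metadata.h for StructMetadata shows that its real type is TargetStructMetadata.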
using StructMetadata = TargetStructMetadata<InProcess>;
Next, looking at the structure of TargetStructMetadata, we find that it inherits from TargetValueMetadata, which in turn inherits from TargetMetadata.
struct TargetStructMetadata : public TargetValueMetadata<Runtime> {
struct TargetValueMetadata : public TargetMetadata<Runtime> {
We can therefore follow this inheritance chain to reconstruct the layout of TargetStructMetadata.
The code shows that the first property of TargetStructMetadata is Kind. Besides it there is a Description, which points to the type's descriptor.
struct TargetMetadata {
  ......
private:
  /// The kind. Only valid for non-class metadata; getKind() must be used to get
  /// the kind value.
  StoredPointer Kind;
  ......
}

struct TargetValueMetadata : public TargetMetadata<Runtime> {
  using StoredPointer = typename Runtime::StoredPointer;
  TargetValueMetadata(MetadataKind Kind,
                      const TargetTypeContextDescriptor<Runtime> *description)
      : TargetMetadata<Runtime>(Kind), Description(description) {}

  // Records the description of the metadata
  /// An out-of-line description of the type.
  TargetSignedPointer<Runtime, const TargetValueTypeDescriptor<Runtime> * __ptrauth_swift_type_descriptor> Description;
  ......
}
This gives us the following layout for TargetStructMetadata:
struct TargetStructMetadata<T> {
    // StoredPointer Kind; on a 64-bit system `using StoredPointer = uint64_t;`, so this is an Int
    var kind: Int
    // Declared as UnsafeMutablePointer for now; the layout of typeDescriptor is analyzed later.
    // T is a generic placeholder for the descriptor type.
    var typeDescriptor: UnsafeMutablePointer<T>
}
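As a quick sanity check of the layout above, the first word of a type's metadata can be read directly. This is a minimal sketch, assuming a platform where an `Any.Type` value is a single pointer to the metadata record; `metadataKind(of:)` and `Point` are hypothetical names introduced only for this illustration. In the Swift 5 runtime the kind value for structs is 0x200.

```swift
// Hypothetical helper: read the first word (Kind) of a type's metadata record.
// Assumes an `Any.Type` value is one pointer wide and points at the Kind word.
func metadataKind(of type: Any.Type) -> Int {
    let metadata = unsafeBitCast(type, to: UnsafePointer<Int>.self)
    return metadata.pointee
}

// Hypothetical sample type used only for this sketch.
struct Point {
    var x: Int = 0
    var y: Int = 0
}

let structKind = metadataKind(of: Point.self)
print("kind = 0x" + String(structKind, radix: 16))
```

The second word, the descriptor pointer, is what the following sections decode.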
Getting TargetStructDescriptor
Next we decode Description. The source shows that for a struct, Description is a TargetStructDescriptor.
const TargetStructDescriptor<Runtime> *getDescription() const {
  return llvm::cast<TargetStructDescriptor<Runtime>>(this->Description);
}
Looking up TargetStructDescriptor, we find that it inherits from TargetValueTypeDescriptor and has two properties: NumFields (the number of properties) and FieldOffsetVectorOffset (the offset, within the metadata, of the vector that stores the properties' offsets).
class TargetStructDescriptor final
    : public TargetValueTypeDescriptor<Runtime>,
      public TrailingGenericContextObjects<TargetStructDescriptor<Runtime>,
                            TargetTypeGenericContextDescriptorHeader,
                            /*additional trailing objects*/
                            TargetForeignMetadataInitialization<Runtime>,
                            TargetSingletonMetadataInitialization<Runtime>,
                            TargetCanonicalSpecializedMetadatasListCount<Runtime>,
                            TargetCanonicalSpecializedMetadatasListEntry<Runtime>,
                            TargetCanonicalSpecializedMetadatasCachingOnceToken<Runtime>> {
  ......
  /// The number of stored properties in the struct.
  /// If there is a field offset vector, this is its length.
  uint32_t NumFields; // number of properties

  /// The offset of the field offset vector for this struct's stored
  /// properties in its metadata, if any. 0 means there is no field offset
  /// vector.
  uint32_t FieldOffsetVectorOffset; // offset of the field offset vector in the metadata
TargetValueTypeDescriptor inherits from TargetTypeContextDescriptor, which has three properties: Name (the name of the type), AccessFunctionPtr (a pointer to the type's metadata access function) and Fields (a pointer to the type's field descriptor).
class TargetValueTypeDescriptor
    : public TargetTypeContextDescriptor<Runtime> {
public:
  static bool classof(const TargetContextDescriptor<Runtime> *cd) {
    return cd->getKind() == ContextDescriptorKind::Struct ||
           cd->getKind() == ContextDescriptorKind::Enum;
  }
};

class TargetTypeContextDescriptor
    : public TargetContextDescriptor<Runtime> {
public:
  /// The name of the type.
  // Name of the type
  TargetRelativeDirectPointer<Runtime, const char, /*nullable*/ false> Name;

  /// A pointer to the metadata access function for this type.
  ///
  /// The function type here is a stand-in. You should use getAccessFunction()
  /// to wrap the function pointer in an accessor that uses the proper calling
  /// convention for a given number of arguments.
  // Pointer to this type's metadata access function
  TargetRelativeDirectPointer<Runtime, MetadataResponse(...),
                              /*Nullable*/ true> AccessFunctionPtr;

  /// A pointer to the field descriptor for the type, if any.
  // Pointer to the type's field descriptor
  TargetRelativeDirectPointer<Runtime, const reflection::FieldDescriptor,
                              /*nullable*/ true> Fields;
  ......
}
TargetTypeContextDescriptor in turn inherits from the base class TargetContextDescriptor, which has two properties: Flags (flags describing the context, including its kind and version) and Parent (the parent context, or NULL for a top-level context that has no parent).
/// Base class for all context descriptors.
template<typename Runtime>
struct TargetContextDescriptor {
  /// Flags describing the context, including its kind and format version.
  // Flags describing the context, including kind and version
  ContextDescriptorFlags Flags;

  /// The parent context, or null if this is a top-level context.
  // The parent context; NULL when this is a top-level context
  TargetRelativeContextPointer<Runtime> Parent;
  ......
}
At this point the layout of TargetStructDescriptor is clear, so we can write it out in Swift and replace the generic T in TargetStructMetadata with it.
struct TargetStructMetadata {
    var kind: Int
    var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}

struct TargetStructDescriptor {
    // Flags describing the context, including kind and version
    var flags: Int32 // ContextDescriptorFlags, 32 bits wide
    // The parent context; NULL when this is a top-level context
    var parent: TargetRelativeContextPointer<UnsafeRawPointer> // relative address
    // Name of the type
    var name: TargetRelativeDirectPointer<CChar> // relative address
    // Pointer to this type's metadata access function
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer> // relative address
    // Pointer to the type's field descriptor
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor> // relative address
    // Number of properties
    var numFields: Int32
    // Offset of the field offset vector in the metadata
    var fieldOffsetVectorOffset: Int32
}

// The types of some of these properties are resolved below
/// Common flags stored in the first 32-bit word of any context descriptor.
// flags is 32 bits wide, so Int32 is used for it
struct ContextDescriptorFlags {
private:
  uint32_t Value;
}
Implementing TargetRelativeDirectPointer
For the relative address type TargetRelativeDirectPointer, searching the source shows that it is simply an alias for RelativeDirectPointer.
template <typename Runtime, typename Pointee, bool Nullable = true>
using TargetRelativeDirectPointer
  = typename Runtime::template RelativeDirectPointer<Pointee, Nullable>;
RelativeDirectPointer is defined in RelativePointer.h. It inherits from the base class RelativeDirectPointerImpl, which holds a single property, RelativeOffset (the offset), together with a method that turns that offset into the real memory address.
template <typename T, bool Nullable = true, typename Offset = int32_t,
          typename = void>
class RelativeDirectPointer;

/// A direct relative reference to an object that is not a function pointer.
// Offset is passed as Int32
template <typename T, bool Nullable, typename Offset>
class RelativeDirectPointer<T, Nullable, Offset,
    typename std::enable_if<!std::is_function<T>::value>::type>
    : private RelativeDirectPointerImpl<T, Nullable, Offset>
{
  ......
}

/// A relative reference to a function, intended to reference private metadata
/// functions for the current executable or dynamic library image from
/// position-independent constant data.
template<typename T, bool Nullable, typename Offset>
class RelativeDirectPointerImpl {
private:
  /// The relative offset of the function's entry point from *this.
  Offset RelativeOffset;
  ......
  // Computes the address from the offset and returns it as the generic type T
  PointerTy get() const & {
    // Check for null.
    if (Nullable && RelativeOffset == 0)
      return nullptr;

    // The value is addressed relative to `this`.
    uintptr_t absolute = detail::applyRelativeOffset(this, RelativeOffset);
    return reinterpret_cast<PointerTy>(absolute);
  }
  ......
}

/// Apply a relative offset to a base pointer. The offset is applied to the base
/// pointer using sign-extended, wrapping arithmetic.
// Computes the address from the offset
template<typename BasePtrTy, typename Offset>
static inline uintptr_t applyRelativeOffset(BasePtrTy *basePtr, Offset offset) {
  static_assert(std::is_integral<Offset>::value &&
                std::is_signed<Offset>::value,
                "offset type should be signed integer");

  auto base = reinterpret_cast<uintptr_t>(basePtr);
  // We want to do wrapping arithmetic, but with a sign-extended
  // offset. To do this in C, we need to do signed promotion to get
  // the sign extension, but we need to perform arithmetic on unsigned values,
  // since signed overflow is undefined behavior.
  auto extendOffset = (uintptr_t)(intptr_t)offset;
  // Pointer address + stored offset: shift through memory to reach the value
  return base + extendOffset;
}
From this we can write out TargetRelativeDirectPointer in Swift:
// Pointee is the generic pointed-to type
struct TargetRelativeDirectPointer<Pointee> {
    var offset: Int32

    // Compute the real memory address from the offset
    mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
        let offset = self.offset
        return withUnsafePointer(to: &self) { p in
            // Advance by offset bytes, then rebind the memory to Pointee
            return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
        }
    }
}
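The arithmetic in applyRelativeOffset can be tried out on ordinary heap memory, without touching runtime metadata at all. The following is a minimal sketch: `resolveRelativePointer(at:)` and `relativePointerDemo()` are hypothetical names, and the 16-byte buffer layout is made up purely for illustration. It shows that the target address is the address of the offset field itself plus the signed 32-bit value stored there, so offsets can point forward or backward.

```swift
// Hypothetical helper: the target is the address of the offset field itself
// plus the sign-extended Int32 stored in that field.
func resolveRelativePointer(at field: UnsafeRawPointer) -> UnsafeRawPointer {
    let offset = field.load(as: Int32.self)
    return field + Int(offset)
}

// Builds a made-up 16-byte buffer holding two values and two relative offsets.
func relativePointerDemo() -> (forward: Int32, backward: Int32) {
    let buffer = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 4)
    defer { buffer.deallocate() }

    buffer.storeBytes(of: Int32(111), toByteOffset: 0, as: Int32.self)  // target of the backward offset
    buffer.storeBytes(of: Int32(8), toByteOffset: 4, as: Int32.self)    // field at byte 4 points to byte 12
    buffer.storeBytes(of: Int32(-8), toByteOffset: 8, as: Int32.self)   // field at byte 8 points to byte 0
    buffer.storeBytes(of: Int32(222), toByteOffset: 12, as: Int32.self) // target of the forward offset

    let base = UnsafeRawPointer(buffer)
    let forward = resolveRelativePointer(at: base + 4).load(as: Int32.self)
    let backward = resolveRelativePointer(at: base + 8).load(as: Int32.self)
    return (forward, backward)
}

let demo = relativePointerDemo()
print(demo.forward, demo.backward) // prints "222 111"
```

Because the offset is relative to the field's own address, a struct like TargetRelativeDirectPointer only resolves correctly when it is accessed in place; copying it elsewhere first would change the base address.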
With that in place, TargetStructDescriptor can be revised to:
struct TargetStructDescriptor {
    // Flags describing the context, including kind and version
    var flags: Int32
    // The parent context; NULL when this is a top-level context
    var parent: Int32 // not decoded here, so declared as Int32 for now
    // Name of the type
    var name: TargetRelativeDirectPointer<CChar>
    // Pointer to this type's metadata access function
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
    // Pointer to the type's field descriptor
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
    // Number of properties
    var numFields: Int32
    // Offset of the field offset vector in the metadata
    var fieldOffsetVectorOffset: Int32
}

// TargetRelativeContextPointer is not decoded for now. The source shows that it stores an int32_t offset, so Int32 is used in its place
template<typename Runtime,
         template<typename _Runtime> class Context = TargetContextDescriptor>
using TargetRelativeContextPointer =
  RelativeIndirectablePointer<const Context<Runtime>,
                              /*nullable*/ true, int32_t,
                              TargetSignedContextPointer<Runtime, Context>>;
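The descriptor layout above, seven consecutive 32-bit fields, can be checked by reading a struct's name and property count with raw byte offsets. This is only a sketch under stated assumptions: a 64-bit platform without pointer authentication (not arm64e), the Swift 5 layout described in this article, and a non-generic struct as input. `describeStruct(_:)` and `Sample` are hypothetical names introduced for the illustration.

```swift
// Hypothetical sample type used only for this sketch.
struct Sample {
    var id: Int = 1
    var title: String = "a"
}

// Byte offsets assumed from the layout in this article:
// flags(0) parent(4) name(8) accessFunction(12) fields(16) numFields(20) fieldOffsetVectorOffset(24)
// Only meaningful when `type` is a struct.
func describeStruct(_ type: Any.Type) -> (name: String, numFields: Int32) {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    // Word 1 of the metadata holds the descriptor pointer (assumed unsigned, i.e. not arm64e)
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)
    // `name` is a relative pointer: its own address plus the stored Int32
    let nameField = descriptor + 8
    let namePointer = nameField + Int(nameField.load(as: Int32.self))
    let name = String(cString: namePointer.assumingMemoryBound(to: CChar.self))
    let numFields = descriptor.load(fromByteOffset: 20, as: Int32.self)
    return (name, numFields)
}

let info = describeStruct(Sample.self)
print(info.name, info.numFields)
```

Raw offsets are used here instead of the Swift structs so that the sketch stays self-contained; the typed structs in this article express the same layout.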
FieldDescriptor and FieldRecord
The next step is to decode FieldDescriptor, which looks like this in the source:
// Field descriptors contain a collection of field records for a single
// class, struct or enum declaration.
class FieldDescriptor {
  const FieldRecord *getFieldRecordBuffer() const {
    return reinterpret_cast<const FieldRecord *>(this + 1);
  }

public:
  const RelativeDirectPointer<const char> MangledTypeName;
  const RelativeDirectPointer<const char> Superclass;

  FieldDescriptor() = delete;

  const FieldDescriptorKind Kind;
  const uint16_t FieldRecordSize;
  const uint32_t NumFields;
  ......
  // Returns all properties; each property is wrapped in a FieldRecord
  llvm::ArrayRef<FieldRecord> getFields() const {
    return {getFieldRecordBuffer(), NumFields};
  }
  ......
}

// FieldDescriptorKind is backed by a uint16_t (UInt16 in Swift)
enum class FieldDescriptorKind : uint16_t {
  ......
}
FieldRecord has the following structure in the source:
class FieldRecord {
  const FieldRecordFlags Flags;

public:
  const RelativeDirectPointer<const char> MangledTypeName;
  const RelativeDirectPointer<const char> FieldName;
  ......
}

// Field records describe the type of a single stored property or case member
// of a class, struct or enum.
// FieldRecordFlags is 32 bits wide, so Int32 is used for it
class FieldRecordFlags {
  using int_type = uint32_t;
  ......
}
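Putting FieldDescriptor and FieldRecord together, the property names of a struct can be listed by walking the records that follow the 16-byte FieldDescriptor header. The sketch below assumes the layouts quoted above, a 64-bit platform without pointer authentication, a non-generic struct, and a build that keeps reflection metadata (the default). `resolve(_:)`, `fieldNames(ofStruct:)` and `Book` are hypothetical names used only here.

```swift
// Hypothetical sample type used only for this sketch.
struct Book {
    var pages: Int = 0
    var title: String = ""
    var author: String = ""
}

// Resolve a relative pointer stored at `field` (its address + stored Int32).
func resolve(_ field: UnsafeRawPointer) -> UnsafeRawPointer {
    return field + Int(field.load(as: Int32.self))
}

func fieldNames(ofStruct type: Any.Type) -> [String] {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)
    // `Fields` sits at byte 16 of the struct descriptor and is nullable (0 means null)
    let fieldsField = descriptor + 16
    guard fieldsField.load(as: Int32.self) != 0 else { return [] }
    let fieldDescriptor = resolve(fieldsField)

    // FieldDescriptor header: MangledTypeName(0) Superclass(4) Kind(8) FieldRecordSize(10) NumFields(12)
    let recordSize = Int(fieldDescriptor.load(fromByteOffset: 10, as: UInt16.self))
    let count = Int(fieldDescriptor.load(fromByteOffset: 12, as: UInt32.self))

    return (0..<count).map { index in
        // Records start right after the 16-byte header.
        // FieldRecord: Flags(0) MangledTypeName(4) FieldName(8)
        let record = fieldDescriptor + 16 + index * recordSize
        let namePointer = resolve(record + 8)
        return String(cString: namePointer.assumingMemoryBound(to: CChar.self))
    }
}

let names = fieldNames(ofStruct: Book.self)
print(names)
```

The stride is read from FieldRecordSize rather than hard-coded, which mirrors how the header describes its own records.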
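Computing offsets with fieldOffsetVectorOffset
The last piece is fieldOffsetVectorOffset, which tells us where in the metadata the vector of property offsets is stored. What the source offers is:
// Note: `asWords + offset` advances by `offset` pointer-sized words, not bytes.
// The Swift code in this article reads the entries of the vector as Int32 values.
/// Get a pointer to the field offset vector, if present, or null.
const StoredPointer *getFieldOffsets() const {
  assert(isTypeMetadata());
  auto offset = getDescription()->getFieldOffsetVectorOffset();
  if (offset == 0)
    return nullptr;
  auto asWords = reinterpret_cast<const void * const*>(this);
  return reinterpret_cast<const StoredPointer *>(asWords + offset);
}
However, applying this offset directly to a pointer typed as Int32 returns wrong data, most likely because the offset is counted in pointer-sized words while an Int32 pointer advances in 4-byte steps. HandyJSON's source handles this as follows:
// On 64-bit platforms the offset is multiplied by 2
return Int(UnsafePointer<Int32>(pointer)[vectorOffset * (is64BitPlatform ? 2 : 1) + $0])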
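To see why the factor of 2 appears, the vector can be located by converting the word offset to bytes explicitly and comparing the result with what `MemoryLayout.offset(of:)` reports. This is a sketch only, assuming a 64-bit platform without pointer authentication, a non-generic struct, and 32-bit entries in the vector as HandyJSON reads them. `fieldOffsets(ofStruct:)` and `Account` are hypothetical names introduced for the illustration.

```swift
// Hypothetical sample type used only for this sketch.
struct Account {
    var id: Int = 0
    var owner: String = ""
    var active: Bool = false
}

func fieldOffsets(ofStruct type: Any.Type) -> [Int] {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)
    // numFields is at byte 20 and fieldOffsetVectorOffset at byte 24 of the descriptor
    let numFields = Int(descriptor.load(fromByteOffset: 20, as: UInt32.self))
    let vectorOffset = Int(descriptor.load(fromByteOffset: 24, as: UInt32.self))
    guard vectorOffset != 0 else { return [] } // 0 means there is no field offset vector

    // vectorOffset counts pointer-sized words from the start of the metadata,
    // so it is scaled by the pointer size (8 bytes = two Int32 slots on 64-bit).
    let vector = metadata + vectorOffset * MemoryLayout<UnsafeRawPointer>.size

    // Each entry is assumed to be a 32-bit byte offset into an instance.
    return (0..<numFields).map { index in
        Int(vector.load(fromByteOffset: index * 4, as: UInt32.self))
    }
}

let offsets = fieldOffsets(ofStruct: Account.self)
print(offsets)
```

Scaling by `MemoryLayout<UnsafeRawPointer>.size` on a raw pointer is equivalent to HandyJSON's `vectorOffset * 2` on an `Int32` pointer for 64-bit targets, and it also covers the 32-bit case without a platform flag.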
At this point we have a fairly clear picture of the whole structure:
// Computes a memory address from a relative offset; Pointee is the generic pointed-to type
struct TargetRelativeDirectPointer<Pointee> {
    var offset: Int32

    // Compute the real memory address from the offset
    mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
        let offset = self.offset
        return withUnsafePointer(to: &self) { p in
            // Advance by offset bytes, then rebind the memory to Pointee
            return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
        }
    }
}

struct TargetStructMetadata {
    var kind: Int
    var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}

struct TargetStructDescriptor {
    var flags: Int32
    var parent: Int32
    var name: TargetRelativeDirectPointer<CChar>
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
    var numFields: Int32
    var fieldOffsetVectorOffset: Int32

    // The vector offset is counted in pointer-sized words, hence the * 2 when indexing Int32 values on 64-bit
    func getFieldOffsets(_ metadata: UnsafeRawPointer) -> UnsafePointer<Int32> {
        print(metadata)
        return metadata.assumingMemoryBound(to: Int32.self).advanced(by: numericCast(self.fieldOffsetVectorOffset) * 2)
    }

    // Used when resolving generic arguments
    var genericArgumentOffset: Int {
        return 2
    }
}

struct FieldDescriptor {
    var MangledTypeName: TargetRelativeDirectPointer<CChar>
    var Superclass: TargetRelativeDirectPointer<CChar>
    var kind: UInt16
    var fieldRecordSize: Int16
    var numFields: Int32
    var fields: FieldRecordBuffer<FieldRecord>
}

struct FieldRecord {
    var fieldRecordFlags: Int32
    var mangledTypeName: TargetRelativeDirectPointer<CChar>
    var fieldName: TargetRelativeDirectPointer<UInt8>
}

// Gives access to the FieldRecord entries
struct FieldRecordBuffer<Element> {
    var element: Element

    mutating func buffer(n: Int) -> UnsafeBufferPointer<Element> {
        return withUnsafePointer(to: &self) {
            let ptr = $0.withMemoryRebound(to: Element.self, capacity: 1) { start in
                return start
            }
            return UnsafeBufferPointer(start: ptr, count: n)
        }
    }

    mutating func index(of i: Int) -> UnsafeMutablePointer<Element> {
        return withUnsafePointer(to: &self) {
            return UnsafeMutablePointer(mutating: UnsafeRawPointer($0).assumingMemoryBound(to: Element.self).advanced(by: i))
        }
    }
}
Verifying the code
Now let's use some code to verify the structure we derived.
protocol BrigeProtocol {}

extension BrigeProtocol {
    // Rebind the memory through the protocol and return the value
    static func get(from pointor: UnsafeRawPointer) -> Any {
        // Self is the real type
        pointor.assumingMemoryBound(to: Self.self).pointee
    }
}

struct BrigeMetadataStruct {
    let type: Any.Type
    let witness: Int
}

func custom(type: Any.Type) -> BrigeProtocol.Type {
    let container = BrigeMetadataStruct(type: type, witness: 0)
    let cast = unsafeBitCast(container, to: BrigeProtocol.Type.self)
    return cast
}

// Assumption: the original listing does not show how this runtime function is declared.
// A declaration along these lines is needed for the call below to compile.
@_silgen_name("swift_getTypeByMangledNameInContext")
func swift_getTypeByMangledNameInContext(_ typeNameStart: UnsafePointer<CChar>,
                                         _ typeNameLength: Int,
                                         _ context: UnsafeRawPointer?,
                                         _ genericArguments: UnsafeRawPointer?) -> UnsafeRawPointer?

// The LLPerson struct
struct LLPerson {
    var age: Int = 18
    var name: String = "LL"
    var nameTwo: String = "LLLL"
}

// Create an instance
var p = LLPerson()

// Reinterpret LLPerson's metadata bit for bit as TargetStructMetadata;
// LLPerson.self is treated as an UnsafeMutablePointer<TargetStructMetadata>
let ptr = unsafeBitCast(LLPerson.self as Any.Type, to: UnsafeMutablePointer<TargetStructMetadata>.self)

// Get the struct's name
let namePtr = ptr.pointee.typeDescriptor.pointee.name.getmeasureRelativeOffset()
print("Current struct name: \(String(cString: namePtr))")

// Get the number of properties
let numFields = ptr.pointee.typeDescriptor.pointee.numFields
print("Number of properties in the current struct: \(numFields)")

// Get the vector of property offsets stored in the metadata
let offsets = ptr.pointee.typeDescriptor.pointee.getFieldOffsets(UnsafeRawPointer(ptr).assumingMemoryBound(to: Int.self))

print("----------- start fetch field -------------")
for i in 0..<numFields {
    // Get the property name
    let fieldName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.fieldName.getmeasureRelativeOffset()
    print("----- field \(String(cString: fieldName)) -----")

    // Get the property's offset, measured in bytes
    let fieldOffset = offsets[Int(i)]
    print("Offset of \(String(cString: fieldName)): \(fieldOffset) bytes")

    // This is the Swift-mangled type name; it has to be converted into the real type
    let typeMangleName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.mangledTypeName.getmeasureRelativeOffset()
    // print("\(String(cString: typeMangleName))")

    let genericVector = UnsafeRawPointer(ptr).advanced(by: ptr.pointee.typeDescriptor.pointee.genericArgumentOffset * MemoryLayout<UnsafeRawPointer>.size).assumingMemoryBound(to: Any.Type.self)

    // This needs the runtime function swift_getTypeByMangledNameInContext, which takes four arguments
    let fieldType = swift_getTypeByMangledNameInContext(
        typeMangleName, // the mangled name
        256, // length of the mangled name; it should be computed, but HandyJSON passes 256 directly
        UnsafeRawPointer(ptr.pointee.typeDescriptor), // the context, i.e. the typeDescriptor
        UnsafeRawPointer(genericVector).assumingMemoryBound(to: Optional<UnsafeRawPointer>.self)) // the current generic arguments, used to resolve the symbol

    // Reinterpret fieldType bit for bit as Any.Type
    let type = unsafeBitCast(fieldType, to: Any.Type.self)

    // Bridge through the protocol to get the real type information
    let value = custom(type: type)

    // Get a pointer to the instance p. It has to be converted to UnsafeRawPointer and bound to a 1-byte type (Int8),
    // because the offsets are in bytes; without the conversion the pointer would advance in strides of the struct's size
    let instanceAddress = withUnsafePointer(to: &p) {
        return UnsafeRawPointer($0).assumingMemoryBound(to: Int8.self)
    }
    print("fieldType: \(type) \nfieldValue: \(value.get(from: instanceAddress.advanced(by: Int(fieldOffset))))")
}
print("----------- end fetch field -------------")
Printed output:

The memory addresses also show how the properties are laid out.















