XML Generation with TypeScript. Java-style.
Why?
DAC7 reporting comes down to generating XML files that are later sent to tax authorities. The files are well structured, and the specification describes them in detail. XML generation is straightforward in the Java and .NET worlds, which have incorporated decades of industry best practices, but it is much less developed in JavaScript and TypeScript.
In this article we will figure out how to conveniently generate XML files from TypeScript definitions, much as you would in Java. Let's go!
First try. File Templates.
A frugal approach teaches us to try the simplest thing first before introducing additional complexity. Here's a shortcut: let's prepare some XML templates and fill in the parameters with regex-based replacements. It sounds crazy, but it can work relatively well in simple cases.
Then you can write a function that interpolates the file content with the values from a field map:
export const getFilledFileContent = (
  content: string,
  fieldsMap: { [key: string]: any }
): string => {
  // getTimestamp and constants are defined elsewhere in the project (not shown here)
  const timestamp = getTimestamp();
  let interpolatedContent = content.replace(
    `{{${constants.TIMESTAMP}}}`,
    timestamp
  );
  for (const key of Object.keys(fieldsMap)) {
    // Escape the braces so they are matched literally
    const regEx = new RegExp(`\\{\\{${key}\\}\\}`, "g");
    let value = fieldsMap[key];
    if (value === undefined || value === null) {
      value = '';
    }
    // Use the normalized value, not fieldsMap[key], so missing fields become ''
    interpolatedContent = interpolatedContent.replace(
      regEx,
      String(value)
    );
  }
  return interpolatedContent;
};
This scheme gets you to a first result really fast, but it has an obvious flaw: what if you need to conditionally include or exclude a field? Or, even simpler, how do you omit a field that has no value? Of course, you can maintain several templates, but the complexity will overwhelm you very quickly. So we need a better approach.
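To make the flaw concrete, here is a minimal sketch. The `fillTemplate` helper and the template are hypothetical, made up for this illustration (they are not part of the DAC7 code). It shows that a field without a value still leaves an empty element in the output.

```typescript
// Hypothetical, simplified stand-in for getFilledFileContent (no timestamp handling).
const fillTemplate = (
  content: string,
  fieldsMap: { [key: string]: unknown }
): string => {
  let result = content;
  for (const key of Object.keys(fieldsMap)) {
    // Missing values are normalized to an empty string
    const value = fieldsMap[key] ?? "";
    // split/join replaces every literal occurrence of the placeholder
    result = result.split(`{{${key}}}`).join(String(value));
  }
  return result;
};

// Made-up template: MiddleName is treated as optional in this illustration.
const template =
  "<Person><FirstName>{{firstName}}</FirstName><MiddleName>{{middleName}}</MiddleName></Person>";

const filled = fillTemplate(template, { firstName: "Ada", middleName: undefined });

console.log(filled);
// <Person><FirstName>Ada</FirstName><MiddleName></MiddleName></Person>
```

The empty `<MiddleName></MiddleName>` element stays in the document, and the template has no way to drop it.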
Second try. Emptying the fields through XML serialization
What if we could somehow strip all the empty fields from the final file? For this we need to parse the file into a JS object, remove all the empty attributes and nodes, and then write it back out.
Parsing an XML file into a JS object is possible thanks to the xmlbuilder2 library. The library has a useful example of how to do that:
const { create } = require('xmlbuilder2');
const xmlStr = '<root att="val"><foo><bar>foobar</bar></foo></root>';
const doc = create(xmlStr);
// append a 'baz' element to the root node of the document
doc.root().ele('baz');
const xml = doc.end({ prettyPrint: true });
console.log(xml);
After we have the object tree, we can apply a recursive elimination of empty fields and objects:
private removeEmptyProperties(object) {
  for (const property in object) {
    // typeof null is "object", so exclude null before recursing
    if (object[property] !== null && typeof object[property] === "object") {
      object[property] = this.removeEmptyProperties(object[property])
    }
    const value = object[property]
    const isPropertyEmpty = value === undefined || value === null || value === ""
    const isEmptyArray = Array.isArray(value) && value.length === 0
    // Evaluated only for non-null values: Object.keys(null) would throw
    const isEmptyObject = !isPropertyEmpty && typeof value === "object" && Object.keys(value).length === 0
    if (isPropertyEmpty || isEmptyArray || isEmptyObject) {
      delete object[property]
    }
  }
  return object
}
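As a quick illustration of the removal step, here is a standalone sketch that runs on a plain object instead of a parsed XML document. The names `removeEmpty` and `isEmpty` and the sample data are assumptions made for this example. Unlike the method above, it also filters emptied entries out of arrays so that no holes are left behind.

```typescript
type Json = { [key: string]: any };

// A value counts as empty if it is undefined, null, "", [] or {}.
function isEmpty(value: unknown): boolean {
  if (value === undefined || value === null || value === "") return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === "object") return Object.keys(value as object).length === 0;
  return false;
}

// Recursively deletes empty properties, mutating and returning the same object.
function removeEmpty(node: Json): Json {
  for (const key of Object.keys(node)) {
    let value = node[key];
    if (value !== null && typeof value === "object") {
      value = removeEmpty(value);
      if (Array.isArray(value)) {
        // Drop entries that were deleted or became empty during recursion
        value = value.filter((item) => !isEmpty(item));
      }
      node[key] = value;
    }
    if (isEmpty(value)) delete node[key];
  }
  return node;
}

// Made-up sample data standing in for a parsed document.
const parsed = {
  Seller: {
    Name: "ACME",
    MiddleName: "",
    Address: { Street: null, City: "" },
    Phones: [],
    TaxIds: ["DE123", ""],
  },
};

const cleaned = removeEmpty(parsed);
console.log(JSON.stringify(cleaned));
// {"Seller":{"Name":"ACME","TaxIds":["DE123"]}}
```

Note that `Address` disappears entirely: once its children are removed it becomes an empty object, which is then deleted in turn.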
This solution works for removing empty fields. However, there are still two major downsides:
- It requires deserializing and serializing the document again, which costs memory and CPU; this matters especially in a serverless environment
- It still does not save you from complex conditional logic
Third and final try. Object serialization
To implement a proper solution, let's take a look at what Java-to-XML serialization looks like.
The JAXB library uses annotations to mark up how a Java object should be serialized (in Java this is called marshalling).
Some properties are annotated to be written as attributes (@XmlAttribute), some as elements (@XmlElement), and some are meant to be skipped (@XmlTransient).
Also pay attention to the @XmlType annotation with its propOrder property: the order of fields in an XML document is frequently important, and the DAC7 message is no exception.
Once the class is annotated, some code has to trigger the marshalling itself; in JAXB this is done through a Marshaller obtained from a JAXBContext.
Now, the question is how we could replicate the same API in TypeScript. Let's take a look.
Unfortunately, neither JS nor TS has a reflection API comparable to Java's. But there is an experimental feature called decorators that allows adding metadata to fields. Here's the config to enable it:
{
  "compilerOptions": {
    "experimentalDecorators": true,
    "emitDecoratorMetadata": true
  }
}
However, this is only the language-level support. We still need an API to actually attach and read the metadata. This is the purpose of the reflect-metadata package.
Here's how we can create a decorator that defines the name of an XML attribute:
import "reflect-metadata"

export const xmlAttributeNameMetadataKey = Symbol("xmlAttributeName")

export function Attribute(name: string) {
  return Reflect.metadata(xmlAttributeNameMetadataKey, name)
}
Now we can add the decorator to a property of a TypeScript class:
export class Address {
  @Field("dpi:CountryCode")
  country: string

  @Attribute("legalAddressType")
  legalAddressType?: string
}
We are using another decorator here, @Field, but it works just like @Attribute, only with a different metadata key.
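To show what happens behind these decorators, here is a dependency-free sketch of the idea. It is not the reflect-metadata implementation: `defineMeta` and `getMeta` are hypothetical stand-ins for its define and get operations, and the decorators are applied by calling them directly on the prototype, which is roughly what the `@` syntax does for properties when `experimentalDecorators` is enabled.

```typescript
// Hypothetical mini metadata store: target object -> property -> key -> value.
const store = new WeakMap<object, Map<string, Map<symbol, unknown>>>();

function defineMeta(key: symbol, value: unknown, target: object, property: string): void {
  let byProperty = store.get(target);
  if (!byProperty) {
    byProperty = new Map();
    store.set(target, byProperty);
  }
  let byKey = byProperty.get(property);
  if (!byKey) {
    byKey = new Map();
    byProperty.set(property, byKey);
  }
  byKey.set(key, value);
}

// Looks at the object itself, then walks up its prototype chain, so instances
// see metadata that was registered on the class prototype.
function getMeta(key: symbol, target: object, property: string): unknown {
  for (let current: object | null = target; current; current = Object.getPrototypeOf(current)) {
    const found = store.get(current)?.get(property);
    if (found?.has(key)) return found.get(key);
  }
  return undefined;
}

const xmlNodeNameMetadataKey = Symbol("xmlNodeName");
const xmlAttributeNameMetadataKey = Symbol("xmlAttributeName");

// Same shape for both decorators; only the metadata key differs.
const Field = (name: string) => (target: object, property: string) =>
  defineMeta(xmlNodeNameMetadataKey, name, target, property);
const Attribute = (name: string) => (target: object, property: string) =>
  defineMeta(xmlAttributeNameMetadataKey, name, target, property);

class Address {
  country = "";
  legalAddressType?: string;
}

// Direct calls instead of @Field(...) / @Attribute(...) above the properties.
Field("dpi:CountryCode")(Address.prototype, "country");
Attribute("legalAddressType")(Address.prototype, "legalAddressType");

const address = new Address();
const countryNode = getMeta(xmlNodeNameMetadataKey, address, "country");
const typeAttribute = getMeta(xmlAttributeNameMetadataKey, address, "legalAddressType");
const countryAttribute = getMeta(xmlAttributeNameMetadataKey, address, "country");

console.log(countryNode, typeAttribute, countryAttribute);
// dpi:CountryCode legalAddressType undefined
```

Because the two decorators write under different keys, a serializer can ask "is this property an attribute?" and "what is its node name?" independently.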
OK, we have mapped the objects. Now it's time to serialize the object tree to XML, which needs some serialization code as well. The idea is similar to the removal logic: walk the object tree recursively; on reaching a leaf, take the decorators into account and serialize it; then go back up. Here's the code:
serializeToXml(obj: any, rootNodeName = "root", parent?: any) {
  let root
  if (parent) {
    root = parent.ele(rootNodeName)
  } else {
    root = create({version: "1.0", encoding: "UTF-8"}).ele(rootNodeName)
  }
  const properties = (obj.getPropertyOrder && obj.getPropertyOrder()) || Object.getOwnPropertyNames(obj)
  for (const property of properties) {
    const value = obj[property]
    // Skip properties without a value
    if (value === undefined || value === null) {
      continue
    }
    const metadataProperty = Reflect.getMetadata(xmlNodeNameMetadataKey, obj, property)
    const nodeName = metadataProperty || property
    const attributeName = Reflect.getMetadata(xmlAttributeNameMetadataKey, obj, property)
    if (attributeName) {
      // String() keeps falsy values such as 0 and false
      root.att(attributeName, String(value))
    } else {
      if (this.isObject(value)) {
        // For nested objects, create a new node and recurse
        this.serializeToXml(value, nodeName, root)
      } else if (Array.isArray(value)) {
        // Handle arrays by creating multiple nodes with the same name
        value.forEach(item => {
          if (this.isObject(item)) {
            // Prefer the node name declared on the item's class; fall back to the property's node name
            const itemNodeName = Reflect.getMetadata(xmlTypeNodeNameMetadataKey, item.constructor) || nodeName
            this.serializeToXml(item, itemNodeName, root)
          } else {
            // For primitive array items, just add the text content
            root.ele(nodeName).txt(String(item)).up()
          }
        })
      } else {
        // A property named "value" becomes the text of the current node;
        // this is how you serialize an element that also has attributes
        if (nodeName === "value") {
          root.txt(String(value))
        } else {
          // For primitive values, create a node with text content
          root.ele(nodeName).txt(String(value)).up()
        }
      }
    }
  }
  // Nested calls return the element they created; the top-level call steps up from the root element to the document
  return parent ? root : root.up()
}
There are some open questions, though. For example, what about the field order? One way to solve it is to give the class a method that returns the fields in the required order:
getPropertyOrder = () => [
  "identity",
  "relevantActivities",
  "docSpec",
]
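The effect of this method on the traversal can be sketched in isolation. The classes and values below are hypothetical placeholders, and `propertiesInOrder` is a name chosen for this example; it repeats the lookup expression from the serializer.

```typescript
// Hypothetical class: fields are declared in a different order than required.
class ReportableSeller {
  docSpec = "placeholder-doc-spec";
  identity = "placeholder-identity";
  relevantActivities = "placeholder-activities";

  // Declares the order in which the serializer should visit the fields
  getPropertyOrder = () => ["identity", "relevantActivities", "docSpec"];
}

// Hypothetical class without an explicit order.
class Unordered {
  b = 1;
  a = 2;
}

// Explicit order first, declaration order as a fallback.
function propertiesInOrder(obj: any): string[] {
  return (obj.getPropertyOrder && obj.getPropertyOrder()) || Object.getOwnPropertyNames(obj);
}

const ordered = propertiesInOrder(new ReportableSeller());
const fallback = propertiesInOrder(new Unordered());

console.log(ordered.join(","));  // identity,relevantActivities,docSpec
console.log(fallback.join(",")); // b,a
```

One consequence of this design: when `getPropertyOrder` is present, any field missing from its list is never visited, so it will not appear in the XML.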
And that's it! Let me know if you have any questions.
Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss the architecture problems. Join them at Patreon or Boosty!