CSV, XML and JSON Data
|Data Source||Description||Important Considerations||Example|
|CSV||CSV files are Comma Separated Value format files; however, the delimiter doesn't necessarily have to be a comma, as we often see semicolons or other delimiters used.||Data within CSV files should be properly escaped and properly enclosed. For example, if the delimiter is a comma but a value within the file itself contains a comma (e.g., color = red, orange, blue), the value "red, orange, blue" should be enclosed, typically with double quotes. A value can also contain a double quote, e.g., 1/8" meaning 1/8 of an inch; in that case the inner double quote needs to be escaped with a backslash or other escape character. Note that even if your data doesn't conform to this, we can still work with it. Typically we would perform a cleansing operation that fixes the missing enclosures or escape characters via an algorithm.|
|XML||XML files are eXtensible Markup Language files. Many systems import and export XML files. If a system's preferred method for data interaction is XML, then that system will typically also have an XSD (XML Schema Definition) file, which describes the structure of the XML files: which elements and attributes are allowed, where they are allowed, and how many times they may occur.||It's important to ensure that the XML is both well-formed (meaning it's a complete XML document without any syntax errors) and valid (meaning it conforms to the XSD, if there is one to validate the data against). We have seen cases where exports from enterprise systems produce XML documents that are invalid or, in certain rare cases, not well-formed. While most service providers would skip the analysis and overlook these outliers, we always analyze all data sets provided to us, so we are able to locate these cases and better handle them during any service work we provide for our clients.||<items> <item id="0001" type="donut"> <name>Cake</name> <ppu>0.55</ppu> <batters> <batter id="1001">Regular</batter> <batter id="1002">Chocolate</batter> <batter id="1003">Blueberry</batter> </batters> <topping id="5001">None</topping> <topping id="5002">Glazed</topping> <topping id="5005">Sugar</topping> <topping id="5006">Sprinkles</topping> <topping id="5003">Chocolate</topping> <topping id="5004">Maple</topping> </item> </items>|
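
To make the enclosure, escaping, and well-formedness points above concrete, here is a minimal Python sketch using only the standard library. The function names `parse_delimited` and `is_well_formed` and the sample data are hypothetical illustrations, not part of any particular system. The sketch assumes a backslash is the escape character, and it checks well-formedness only. Validating against an XSD is not covered by the standard library and would need a third-party library.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_delimited(text, delimiter=",", quotechar='"', escapechar="\\"):
    """Parse delimited text whose values may be enclosed and escaped.

    Assumes a quote inside an enclosed value is escaped with `escapechar`
    (e.g. \") rather than doubled (""), so doublequote is turned off.
    """
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quotechar=quotechar,
        escapechar=escapechar,
        doublequote=False,
    )
    return list(reader)


def is_well_formed(xml_text):
    """Return True if xml_text parses as a complete XML document.

    This checks syntax only; it does not validate against an XSD.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Hypothetical sample: one value contains commas, another a double quote.
    sample_csv = 'id,color,size\n1,"red, orange, blue","1/8\\""\n'
    for row in parse_delimited(sample_csv):
        print(row)

    # The same parser with a semicolon delimiter.
    print(parse_delimited('a;b\n"x;y";z\n', delimiter=";"))

    # A properly closed document versus one with a missing closing tag.
    print(is_well_formed("<items><item><name>Cake</name></item></items>"))
    print(is_well_formed("<items><item></items>"))
```

Passing the delimiter, enclosure character, and escape character as parameters reflects the point that real-world files vary in all three, so a parser should not hard-code the comma.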