Data Quality Check for Excel Uploads
Uploading Excels with “dirty” data can lead to wrong reports and significant manual efforts in the clean up.
The issue is that Excel will accept any value in cells, there is no control. However the database has been configured to ensure only valid data can be loaded. However, if the spreadsheet is cleaned up according to the following principals, the load will be seamless.
Example: load and maintain asset level reporting
The “record number” in column A is required
Blank rows are allowed
Masterdata clean up
|
|
---|---|
Fund name | must be a valid fund in LPPM. Can’t be empty, can’t be an approximation which is understood by a human but not by the database.
|
Industry | can’t be empty and must be a valid industry sector ID |
Sector | Valid sector codes: Energy, Consumer, Industrial, Healthcare, TMT, Financials, Real Estate, Infrastructure, Other |
Region | is ignored when uploading as region in the model is derived from Country |
Country | Must be a valid country
|
Private/Public | can’t be empty, must contain either Private or Public |
Entry Date | can’t be empty, must be a valid date in dd/mm/yyyy, dd.mm.yyyy or yyyy-mm-dd format. No American date format, no dates with single digit day or month must be formatted as text not value, otherwise Excel converts to date index value |
Exit Date | can be empty but otherwise same format rules apply as for Entry Date must be formatted as text not value, otherwise Excel converts to date index value |
Numeric data clean up
All numeric fields should be unformatted with no thousand separator, no %, etc.
No “-“ for zero, either blank cell or 0
No “NA”, “NaN, “N/A” or any other variants. Either blank or 0
MOIC must be a number eg. 2.5 not 2.5x. In reports we display relevant KPIs as “2.5x” via number formatting. However if a x is appended manually in Excel via data entry this converts the cell from numeric to text and can’t be loaded to a numeric field as it isn’t a number