How to Mitigate Frustrations Around Linking Multiple MS Excel Workbooks With Inconsistent Data Formats

By Hardeep

July 1, 2020 · 10 min read

A Typical Investment Firm Scenario – Inconsistent Data Stored In Multiple MS Excel Workbooks

Does this scenario sound familiar to your investment management firm? At the end of a time period (day, week, month), multiple Excel workbooks are linked together to generate another set of summarized workbooks which are then used to prepare numerous financial reports. This entire process is repeated at the end of each period, or as and when a request comes in from the investment management team or a current or potential investor. Eventually, the investment management firm ends up with tens, if not hundreds, of Microsoft Excel workbooks, storing data in inconsistent formats, authored by several employees from the front, middle and back-office operations, often on a network-accessible drive.

Unfortunately, these workbooks need to be managed very carefully and a change in any workbook can break the entire operations and reporting setup. Also, the complexity of the workbooks increases over time to a point where it is hard to determine the missing piece or the broken link when the summarized financial reports fail or provide wrong analysis. So, how can an investment management firm move away from creating hundreds of workbooks into a more robust, agile, efficient, and error-free operations environment?

A Practical Solution

A practical solution is to integrate a database with the current Microsoft Excel set up.   In this scenario, the relevant data is stored in the database, and Excel is only used to perform data analysis, summaries, and reports. Microsoft Excel has a maximum record limit – and cannot store more than 1,048,000 rows of data records. This means that if you are storing large data sets over long periods – you will eventually exceed this limit and have to use multiple workbooks. Additionally, most investment management firms will separate their workbooks by the smallest time period that makes sense for the investment management firm. In most cases, this means that the fund is storing all the workbooks by daily time period, thereby creating a new folder for each trading day.

When compared to the limitations of Excel, there is virtually no limit to the number of rows that can be stored in a database. By storing relevant data inside a database, it becomes much easier for an investment management firm to manipulate, analyze, and summarize data for reporting to the investment management team, for compliance reporting, and investor reporting. Also, now you can perform advanced analysis using powerful BI tools like Microsoft Power BI.

Using a database ensures that the relevant front, middle, and back-office data are continuously stored in one central location over long periods (years) of time-which ensures uniformity and consistency. With this arrangement, accessing MTD and YTD data is as easy as accessing data for the previous trade date. The user only has to specify the date ranges for the data needed and the database software will retrieve the matching records per the specified criteria. Data backup is also simpler because only a single database containing all data records needs to be backed up.

Databases also allow the creation of data records that are summarized and optimized for investment management firm front, middle, and back-office reporting purposes. This is similar to what is contained in summary workbooks that have aggregated data over specified periods. This data can include summarized portfolio data by fund, geography, client, or internal, as well as counterparty trade settlement data for reconciliation purposes. With the data being retrieved from the database, the analysis for daily, weekly, monthly, and quarterly periods will take the same amount of time and effort. By comparison, if you wanted to perform a similar summarization in Excel, the amount of time and effort needed will increase with the time period for which the analysis needs to be done.

This means that the chance of errors increases significantly as longer periods are selected.    We would also expect differences in data formats between the workbooks – which would make it more difficult to combine data.  For example, an error occurs when one column is defined as a numeric field in one workbook and then defined as a character field in another workbook. Combining the two data types using formulas will fail – and the individual data columns would have to be converted to ensure they match before aggregating them together. 

Conclusion

In summary, by eliminating the use of a single workbook to store portfolio data by fund, geography, investor, etc., for each month – as is typically the case in most investment management firm middle and back offices – resulting in a total of 60 workbooks over 5 years to analyze for each type of summarization and switching to a database to generate a similar summary report your front, middle, and back-office would be much more efficient because you would retrieve, segment, and summarize the data using a single database query to get the desired results, regardless of the time period being selected.