Building a 3-Layered Database Setup: Advice on Scripting and Data Insertion

Hey there, fellow data enthusiasts! I’m working on a project that involves setting up a three-layer database in SQL Server (managed through SSMS), and I’d love to get your advice on the best approach. Here’s the context: we receive daily CSV, TXT, or DSV files from vendors, and we want a system that imports all the data, processes it, and ultimately feeds it into our Power BI environment.
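For the import step, a minimal pandas/SQLAlchemy sketch might look like the following. The connection string, the `layer1_raw` table name, and the added `vendor`/`loaded_at` columns are all placeholders for illustration, not my actual schema:

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string; adjust server, database, and driver to your setup.
engine = create_engine(
    "mssql+pyodbc://my-server/VendorDB?driver=ODBC+Driver+17+for+SQL+Server"
)

def load_vendor_file(path: str, vendor: str, sep: str = ",") -> None:
    """Append one daily vendor file to the layer-1 (raw) table, duplicates and all."""
    df = pd.read_csv(path, sep=sep)               # sep="\t" for TXT, "|" for DSV, etc.
    df["vendor"] = vendor                         # tag each row with its source vendor
    df["loaded_at"] = pd.Timestamp.now(tz="UTC")  # load timestamp, used later for dedup
    df.to_sql("layer1_raw", engine, if_exists="append", index=False)

load_vendor_file("daily_feed.csv", vendor="acme")
```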

The three layers each handle a different stage of processing. The first layer imports every row from every vendor file, duplicates included. The second layer pulls the most recent data from layer 1, applies some logic for customer and item types, and removes duplicates. The third layer takes the processed data from layer 2 and shapes it for Power BI.
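To make the layers concrete, here is one possible shape for the three tables. Every table and column name here (`layer1_raw`, `layer2_clean`, `layer3_presentation`, and so on) is a hypothetical stand-in, not what my schema actually looks like:

```python
from sqlalchemy import create_engine, text

engine = create_engine(
    "mssql+pyodbc://my-server/VendorDB?driver=ODBC+Driver+17+for+SQL+Server"
)

DDL = """
-- Layer 1: raw landing zone; every row from every file, duplicates included.
IF OBJECT_ID('layer1_raw') IS NULL
    CREATE TABLE layer1_raw (
        vendor      NVARCHAR(50),
        customer_id NVARCHAR(50),
        item_id     NVARCHAR(50),
        quantity    INT,
        loaded_at   DATETIME2
    );

-- Layer 2: deduplicated, with customer/item type logic applied.
IF OBJECT_ID('layer2_clean') IS NULL
    CREATE TABLE layer2_clean (
        vendor        NVARCHAR(50),
        customer_id   NVARCHAR(50),
        customer_type NVARCHAR(20),
        item_id       NVARCHAR(50),
        item_type     NVARCHAR(20),
        quantity      INT,
        loaded_at     DATETIME2
    );

-- Layer 3: presentation copy that Power BI reads.
IF OBJECT_ID('layer3_presentation') IS NULL
    CREATE TABLE layer3_presentation (
        vendor        NVARCHAR(50),
        customer_id   NVARCHAR(50),
        customer_type NVARCHAR(20),
        item_id       NVARCHAR(50),
        item_type     NVARCHAR(20),
        quantity      INT,
        loaded_at     DATETIME2
    );
"""

with engine.begin() as conn:
    conn.execute(text(DDL))
```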

I’ve already written separate Python scripts using the pandas and SQLAlchemy libraries to handle the data processing. However, I’m unsure about the best way to pull data into layers 2 and 3. Currently I’m using a SQL PARTITION BY statement to pull the latest data into layer 2, and I’m thinking of limiting it to 60 days’ worth of data. For layer 3, I’m truncating the table and reinserting the layer 2 data.
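Assuming the placeholder schema above, the layer 2 and layer 3 refreshes could be driven from Python roughly like this. The `ROW_NUMBER() ... OVER (PARTITION BY ...)` query is my reading of the "SQL partition statement" approach, and the `CASE` expressions stand in for whatever the real customer-type and item-type logic turns out to be:

```python
from sqlalchemy import create_engine, text

engine = create_engine(
    "mssql+pyodbc://my-server/VendorDB?driver=ODBC+Driver+17+for+SQL+Server"
)

# Layer 2: rebuild from the last 60 days of layer 1, keeping only the newest
# row per (vendor, customer, item). The CASE expressions are placeholder rules.
LAYER2_SQL = """
TRUNCATE TABLE layer2_clean;

INSERT INTO layer2_clean
    (vendor, customer_id, customer_type, item_id, item_type, quantity, loaded_at)
SELECT vendor,
       customer_id,
       CASE WHEN customer_id LIKE 'W%' THEN 'wholesale' ELSE 'retail' END,
       item_id,
       CASE WHEN quantity >= 100 THEN 'bulk' ELSE 'unit' END,
       quantity,
       loaded_at
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY vendor, customer_id, item_id
               ORDER BY loaded_at DESC
           ) AS rn
    FROM layer1_raw
    WHERE loaded_at >= DATEADD(day, -60, SYSUTCDATETIME())
) ranked
WHERE rn = 1;
"""

# Layer 3: truncate and reload from layer 2, as described above.
LAYER3_SQL = """
TRUNCATE TABLE layer3_presentation;
INSERT INTO layer3_presentation SELECT * FROM layer2_clean;
"""

with engine.begin() as conn:   # one transaction: rolls back both steps on error
    conn.execute(text(LAYER2_SQL))
    conn.execute(text(LAYER3_SQL))
```

One thing I like about wrapping both refreshes in a single transaction is that Power BI never sees a half-empty layer 3 table if the job fails partway through, but I’m open to better patterns.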

I’m new to this kind of project and lack a CS background, so I’d appreciate any advice on improving my approach. Have you worked on a similar project? What strategies would you recommend for efficient data processing and insertion?
