Elevating DataTyr’s operations with cutting-edge data integration and AI.
January 17, 2025
DataTyr specializes in aggregating and managing supply chain data, covering sales, prices, and inventory from multiple pharmaceutical wholesalers and vendors across Africa. Operating in markets where universal product codes (UPCs) do not exist, DataTyr takes on the challenge of collating and standardizing heterogeneous datasets that come from a variety of ERP systems. Its success depends on providing accurate, up-to-date information to stakeholders, from government regulatory bodies to international pharmaceutical companies.
Over time, the volume of DataTyr’s incoming data grew exponentially, and the lack of a consistent product identification system compounded the complexity. Wholesalers often used their own SKU references, trade names varied from country to country, and different distributors might offer slight variations of the same chemical formulation.
As a result, data quality issues surfaced in several key areas:
Establishing a Robust Entity Resolution Framework
Syntaxia designed an intelligent matching system to correlate trade names and SKUs back to a single “core” product record. Through a combination of string-matching algorithms, reference dictionaries, and heuristic models, the system recognizes, for example, that “Panadol” and “Tylenol” share the same active ingredient (acetaminophen, also known as paracetamol), collapsing scattered records into one unified view.
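To make the idea concrete, here is a minimal sketch of dictionary-backed fuzzy matching in Python. It is not Syntaxia's implementation: the reference dictionary, the noise-token list, the 0.85 similarity threshold, and the function names are all assumptions chosen for illustration, and the string comparison uses the standard library's `difflib.SequenceMatcher` rather than whatever algorithms the production system relies on.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical reference dictionary: known trade name -> core product key.
TRADE_NAME_TO_CORE = {
    "panadol": "acetaminophen",
    "tylenol": "acetaminophen",
    "brufen": "ibuprofen",
    "advil": "ibuprofen",
}

# Hypothetical list of tokens that describe units or dosage form, not the product.
NOISE_TOKENS = {"mg", "ml", "g", "tab", "tabs", "tablet", "tablets",
                "caps", "capsules", "syrup"}


def name_tokens(raw_name):
    """Lowercase the name and keep alphabetic tokens that are not unit/form noise."""
    tokens = re.findall(r"[a-z0-9]+", raw_name.lower())
    return [t for t in tokens if t.isalpha() and t not in NOISE_TOKENS]


def resolve_core_product(raw_name, threshold=0.85):
    """Return the core product key for a wholesaler's product name.

    Each token is compared with every known trade name; the best-scoring
    match wins if it reaches the threshold, otherwise None is returned.
    """
    best_core, best_score = None, 0.0
    for token in name_tokens(raw_name):
        for trade_name, core in TRADE_NAME_TO_CORE.items():
            score = SequenceMatcher(None, token, trade_name).ratio()
            if score > best_score:
                best_core, best_score = core, score
    return best_core if best_score >= threshold else None


def group_by_core_product(records):
    """Collapse (sku, name) pairs into {core product: [sku, ...]}.

    Records that match nothing are grouped under the key None for review.
    """
    groups = defaultdict(list)
    for sku, name in records:
        groups[resolve_core_product(name)].append(sku)
    return dict(groups)
```

Calling `group_by_core_product` on records from different wholesalers, such as `("W1-001", "PANADOL 500MG TABS")` and `("W2-77", "Tylenol-Extra")`, places both SKUs under the same core key. Keeping unmatched names under `None` instead of forcing a match leaves room for manual review, which matters when a wrong merge would mix up different medications.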
Backward-Compatible Pricing Updates
Syntaxia introduced functionality to update prices and margins historically, an especially important feature in heavily regulated markets where product prices often change abruptly when government agencies issue new mandates. For example, a wholesaler might have purchased a particular medication at an approved price X, intending to sell it at X with a certain profit margin. If the government then announces a new official price (or margin) that takes effect on a specific date, the wholesaler must adjust inventory records to reflect these updates, often splitting their stock between items purchased under the old price and those subject to the new pricing rules.
To address this, Syntaxia engineered a “retroactive” pricing mechanism that tracks price changes over time without erasing historical accuracy. As soon as a new price goes into effect, DataTyr’s system automatically applies the updated rates to all future transactions while preserving the older price in existing inventory records. This ensures that previously purchased stock is still recorded under its original cost basis and profit structure. Meanwhile, any new purchases and sales adopt the fresh government-mandated price from the specified date onward. The system then “ripples” these updates through relevant data tables to keep every historical record legally compliant, yet perfectly aligned with real-world inventory conditions. All of this happens with minimal manual intervention, preventing the chaos of disconnected system overrides.
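One common way to get this behavior is an effective-dated price schedule: new mandates are appended with the date they take effect, and every lookup asks which price was in force on a given day. The Python sketch below illustrates that pattern only; the `PriceHistory` class, its method names, and the sample dates and prices are hypothetical and are not taken from DataTyr's system.

```python
from bisect import bisect_right
from datetime import date
from decimal import Decimal


class PriceHistory:
    """Append-only schedule of mandated prices for a single product (illustrative)."""

    def __init__(self):
        self._dates = []   # effective dates, kept sorted
        self._prices = []  # price in force from the matching date onward

    def add_mandate(self, effective_from, unit_price):
        """Record a new official price without touching earlier entries."""
        i = bisect_right(self._dates, effective_from)
        self._dates.insert(i, effective_from)
        self._prices.insert(i, unit_price)

    def price_on(self, day):
        """Return the price in force on the given day."""
        i = bisect_right(self._dates, day)
        if i == 0:
            raise LookupError(f"no price in force on {day}")
        return self._prices[i - 1]


# Made-up example data: an original price and a later government mandate.
history = PriceHistory()
history.add_mandate(date(2024, 1, 1), Decimal("10.00"))
history.add_mandate(date(2024, 6, 1), Decimal("12.00"))

# Stock bought before the mandate keeps its original price basis,
# while transactions on or after the effective date use the new price.
old_lot_price = history.price_on(date(2024, 3, 15))
new_sale_price = history.price_on(date(2024, 6, 1))
```

Because earlier entries are never overwritten, the same lookup serves both historical reporting and new transactions: each inventory lot is priced by its purchase date, and each sale by its transaction date.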
BigQuery Foundation and Early Gains
As a first step, Syntaxia built an architecture on Google BigQuery for data warehousing. Custom JavaScript code (running in Google Apps Script) interfaced with BigQuery SQL to perform intermediate transformations and automation. This shift provided immediate improvements in data transparency and reduced repetitive manual tasks related to data harmonization, entity resolution, and reporting.
Migrating to Snowflake and Snowpark
Once Snowflake introduced Snowpark Container Services, Syntaxia replicated the core solution in the new environment. By moving the data pipelines and entity resolution logic into the warehouse, DataTyr gained a notable performance boost: jobs that used to take hours in the BigQuery plus external script model now ran in minutes. This speed increase allowed DataTyr to handle more complex analytics, onboard new suppliers faster, and reduce operational overhead.
The key advantage of running applications within Snowpark Container Services is that the data never has to leave the warehouse for processing. Traditional approaches often require moving data out to external servers for computation and then piping the results back in, an extra round trip that slows performance and adds complexity. By computing everything inside Snowflake, DataTyr streamlined its workflows, minimized data movement, and improved both speed and resource efficiency.
DataTyr’s newly integrated ecosystem resolved what used to be a highly fragmented data landscape. Instead of juggling multiple inconsistent product IDs and pricing records:
By introducing a cohesive entity resolution framework, automated pricing reconciliation, and advanced in-warehouse computation, Syntaxia transformed DataTyr’s chaotic sprawl of pharmaceutical data into a structured, high-performance system. That clarity and precision have pushed DataTyr to a leadership position within African pharmaceutical supply chains. As the company continues to expand into new markets, it can rely on a future-proof data architecture that accommodates regulatory changes, diverse naming conventions, and growing transaction volumes, all while enabling real-time insights and streamlined operations.