Khanh Nguyen
Central Member Data Service
Introduction
"Good afternoon, everyone. Today, I'm excited to share with you the journey and outcomes of a pivotal project I led at SoFi, known as the 'Central Member Data Service.' This initiative was aimed at addressing the fragmentation of member data across our platforms, a challenge that not only impacted our operational efficiency but also our ability to innovate and deliver seamless services to our members."
Problem Statement
The core issue we faced at SoFi was multifaceted. We had multiple services managing member data, but no centralized expertise or system to ensure consistency and accessibility. This led to duplicated data objects across services and confusion about where to find authoritative data. Teams like Marketing, Communication, and Data Analysis struggled to access reliable information, impeding their ability to execute their functions effectively. Simply put, there was no single source of truth for member data, affecting everything from marketing initiatives to fraud detection.
Project Goals and Constraints
Our objective was clear: to unify member data into a single, efficient, and easily accessible service. This meant creating a system that was not only fast and flexible but also consistent across all organizational departments and extendable for future integrations. However, we faced constraints such as the existing cached-data setup, the need for seamless integration with internal tools like Braze, and the diverse data requirements of teams ranging from marketing to fraud detection.
High-Level Technical Design
The service was conceptualized to meet four key criteria:
Performance (sub-100 ms response times),
Flexibility (letting engineers access precisely the data they need, without extraneous information),
Consistency (ensuring data uniformity across banking, investment, and other departments), and
Extensibility (simplifying the process for other teams to integrate their data).
To achieve these goals, I designed a high-level architecture powered by GraphQL, using Spring GraphQL for essential features such as DataLoader and BatchLoader. This setup gives clients the flexibility to request exactly the data sets they need.
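The sketch below shows how those Spring GraphQL pieces fit together, assuming hypothetical Member and Account types and an accountsByMemberIds service method; the real schema and service names in the project may well have differed. The batch loader collapses per-member lookups into one backend call per request instead of one call per member (the classic N+1 problem).

```java
import org.dataloader.DataLoader;
import org.springframework.graphql.data.method.annotation.SchemaMapping;
import org.springframework.graphql.execution.BatchLoaderRegistry;
import org.springframework.stereotype.Controller;
import reactor.core.publisher.Mono;

import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

@Controller
public class MemberAccountController {

    private final AccountService accountService;

    public MemberAccountController(AccountService accountService, BatchLoaderRegistry registry) {
        this.accountService = accountService;
        // Register a mapped batch loader: it receives every member ID requested
        // anywhere in the query and returns all matching accounts in one call.
        registry.forTypePair(String.class, Account.class)
                .registerMappedBatchLoader((memberIds, env) ->
                        Mono.fromCallable(() -> accountService.accountsByMemberIds(memberIds)));
    }

    // Resolves the Member.account field; loads are batched across the whole query.
    @SchemaMapping
    public CompletableFuture<Account> account(Member member, DataLoader<String, Account> loader) {
        return loader.load(member.id());
    }

    public record Member(String id, String firstName) {}
    public record Account(String id, String status) {}

    public interface AccountService {
        Map<String, Account> accountsByMemberIds(Set<String> memberIds);
    }
}
```

With this in place, a client that only needs an account's status can ask for, say, { member(id: "m-123") { account { status } } }, and nothing beyond that is fetched or serialized.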
At the build stage, the service runs a tool that processes YAML configuration files from the various onboarding teams. Each file details the essential information about a team's API: team contacts, the Slack channel for on-call communication, any personally identifiable information (PII) in the data, and the data model required for service integration. If the configuration does not specify a data model, the default is to ingest all available data from the API's response.

From these files, the system dynamically generates data models, database schemas, and GraphQL schemas, integrating them into a master schema. It also creates listeners for the Kafka topics dedicated to change data capture, using Debezium to stream changes from the source API databases into Kafka in real time; the service then extracts data from the Kafka messages and stores it in its own database.

On the read side, the service answers real-time queries through GraphQL resolver functions and supports subscriptions for updates, with changes originating from the data source itself. Depending on the configuration, new jobs may also be queued to backfill historical data, rounding out the data-management story.
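The write-up doesn't include a sample onboarding file, so the keys below are hypothetical, but a config in this spirit, parsed at build time with a YAML library such as SnakeYAML, would carry the contacts, Slack channel, PII markers, and data model just described:

```java
import org.yaml.snakeyaml.Yaml;

import java.util.Map;

public class OnboardingConfigExample {

    // Hypothetical onboarding file for one team; the real key names may differ.
    static final String EXAMPLE_CONFIG = """
            team: money
            contacts:
              - money-eng@example.com
            slackChannel: "#money-oncall"
            pii:
              - ssn
              - dateOfBirth
            # If `model` is omitted, the tool defaults to ingesting every field
            # in the source API's response.
            model:
              BankAccount:
                fields: [id, memberId, status, openedAt]
            """;

    public static void main(String[] args) {
        Map<String, Object> config = new Yaml().load(EXAMPLE_CONFIG);
        // The build-stage tool would walk this map to generate the data model,
        // the database schema, a GraphQL schema fragment for the master schema,
        // and a Kafka change-capture listener for the team's tables.
        System.out.println("Onboarding team: " + config.get("team"));
    }
}
```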
Technology Choices and Trade-offs
The decision to use GraphQL was driven by the need for data retrieval flexibility. Despite the team's initial unfamiliarity with GraphQL and the challenges associated with schema stitching and data loader mapping, this technology was deemed the best fit for our requirements. We opted for PostgreSQL due to its widespread support within the company, cost-effectiveness, and established infrastructure, including Flyway setup. However, this choice introduced complexities in generating data models without using join tables, a hurdle we managed to overcome.
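The section above doesn't say how the join-table hurdle was actually cleared, so the following is only one common PostgreSQL pattern for the problem, not necessarily the one we used: nested sub-objects are stored in a JSONB column, so a generated data model maps to a single table per entity with no join tables. The table, column, and connection details are made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

public class JsonbStorageSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; requires the PostgreSQL JDBC driver.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/memberdata", "app", "secret")) {

            // Nested sub-objects live in a JSONB column rather than join tables,
            // so one generated entity maps to one row.
            try (Statement stmt = conn.createStatement()) {
                stmt.execute("""
                        CREATE TABLE IF NOT EXISTS member_profile (
                            member_id TEXT PRIMARY KEY,
                            payload   JSONB NOT NULL
                        )""");
            }

            // Idempotent upsert keeps the row current as new payloads arrive.
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO member_profile (member_id, payload) VALUES (?, ?::jsonb) " +
                    "ON CONFLICT (member_id) DO UPDATE SET payload = EXCLUDED.payload")) {
                ps.setString(1, "m-123");
                ps.setString(2, "{\"address\":{\"city\":\"San Francisco\"},\"phones\":[\"+15550100\"]}");
                ps.executeUpdate();
            }
        }
    }
}
```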
Kafka was selected for event data handling to minimize latency and facilitate efficient data integration into our database, adding a layer of complexity to the project's data management strategy.
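A minimal sketch of the consuming side of that pipeline, assuming Debezium's default topic naming and change-event envelope; the topic name and upsert stub are placeholders, not the service's real identifiers:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ChangeCaptureConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "central-member-data-service");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        ObjectMapper mapper = new ObjectMapper();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Debezium names topics <server>.<schema>.<table>; this one is illustrative.
            consumer.subscribe(List.of("memberdb.public.bank_account"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    try {
                        // The change event carries the row state after the change.
                        JsonNode envelope = mapper.readTree(record.value());
                        JsonNode after = envelope.path("payload").path("after");
                        if (!after.isMissingNode()) {
                            upsert(after); // write into the service's own PostgreSQL tables
                        }
                    } catch (Exception e) {
                        // In production this would go to a dead-letter topic or alerting.
                        System.err.println("Skipping malformed record: " + e.getMessage());
                    }
                }
            }
        }
    }

    static void upsert(JsonNode row) {
        System.out.println("Would upsert: " + row);
    }
}
```

Because Debezium emits the full post-change row state in the event's after field, the service can keep its own PostgreSQL copy current in near real time without polling the source APIs.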
Project Rollout Challenges
The project initially faced hesitation over the choice of PostgreSQL, stemming from concerns about the complexity of mapping our data models onto it. Through a proof of concept (POC), I demonstrated the viability of our approach and alleviated these concerns. Data backfill posed another significant challenge, requiring us to denormalize certain data sets to fit our database schema.
Overall, the project's implementation involved navigating various technical and organizational challenges, from adopting new technologies to refining data management practices, culminating in a robust service that significantly enhances data accessibility and consistency across SoFi.
My Role
Project Leadership and Architectural Design: I spearheaded the development of the Product Requirement Document (PRD) and crafted the high-level architecture to address existing challenges. This involved a proactive approach to understand and consolidate requirements across multiple teams.
Technical Leadership and Team Mentorship: I championed the adoption of GraphQL, emphasizing its critical role in achieving data retrieval flexibility. My advocacy was instrumental in aligning the team with the benefits of GraphQL and ensuring its integration into our architectural design. I also led the refinement of the project's detailed design, presenting our high-level architectural vision to the platform organization for feedback. This process was crucial for securing buy-in and adjusting our approach based on constructive criticism.
Cross-functional Collaboration: Engaging in extensive discussions with various teams within the organization, I gathered insights into their specific pain points. This collaborative effort extended to stakeholders from different departments, aiming to understand their needs and how the service could support them. My interactions helped shape the PRD, focusing on addressing these challenges effectively.
Stakeholder Engagement: In partnership with a distinguished engineer and my manager, I facilitated the expansion of the PRD. Together, we navigated through the project's intricacies, ensuring all concerns were meticulously addressed. My role involved direct communication with six different teams, gauging their interest and securing their commitment to onboard from the project's inception.
Proof of Concept (POC): To demonstrate the feasibility of our proposed solutions and guide the team through uncertain decisions regarding database selection and GraphQL library adoption, I developed and presented a POC. This not only showcased the potential of our architectural design but also provided a clear direction for the team as they delved into the specifics of database and GraphQL library selection.
Results and Reflections
The service is now operational, handling 1.2 million requests daily with a 90th-percentile response time of 80 ms. Launching this project presented numerous challenges, including assembling a new team, defining product requirements, and advocating for the service's design and objectives. Despite these hurdles, the project has significantly benefited various teams within the member organization, enabling them to access and use data consistently. Moreover, it has empowered other parts of the organization to build upon and innovate with the foundation we've established. The satisfaction from overcoming these challenges and seeing the positive impact on the organization is immense.