Senior Data Lakehouse Architect (Databricks), Vice President

State Street
2 days ago
Full-time
On-site
Boston, Massachusetts, United States
$120,000 - $202,500 USD yearly

Corporate Functions Technology

Who We Are Looking For

We are seeking a Senior Data Lakehouse Architect to design and lead the build-out of a Legal Data Lakehouse platform on AWS and Databricks. This role will drive the architecture, engineering, and governance of scalable, secure, and compliant data capabilities supporting legal operations, contract intelligence, eDiscovery, and AI/ML use cases.

The ideal candidate brings deep expertise in Databricks, AWS data platforms, and enterprise data architecture, with experience delivering solutions in regulated environments aligned to security, compliance, and audit requirements.

Why This Role Is Important to Us

State Street’s Legal function works with a broad set of contracts, matters, regulatory obligations, documents, and workflows that are distributed across multiple systems and formats. Building a modern Legal Data Lakehouse is critical to creating a trusted, governed foundation that brings these data sources together, making legal information easier to access, analyze, and use at scale.

This role is critical to establishing a secure and scalable data foundation that enables legal analytics and AI use cases while strengthening governance, auditability, and global consistency across Legal.

What You Will Be Responsible For

1. Architecture & Platform Design

  • Define and implement the end-to-end Legal Data Lakehouse architecture using Databricks (Delta Lake, Unity Catalog, Workflows) on AWS
  • Design multi-layered data architecture (Bronze, Silver, Gold) to support:
    • Contract metadata and document ingestion
    • Legal matter management data
    • eDiscovery datasets
    • External regulatory and compliance feeds
  • Establish scalable ingestion frameworks (batch and streaming) for structured and unstructured legal data (PDFs, contracts, emails)

2. Data Engineering & Integration

  • Lead development of ETL/ELT pipelines using Databricks, Spark, and Python/SQL
  • Integrate with enterprise platforms, including:
    • Contract lifecycle management systems
    • AI platforms and LLM pipelines
    • Document repositories and enterprise content systems
  • Design patterns for extracting structured data from unstructured legal documents and persisting it to Delta Lake
  • Enable downstream integration with enterprise data platforms, analytics tools, and AI/ML pipelines

3. Governance, Security & Compliance

  • Implement data governance frameworks using Databricks Unity Catalog and AWS-native controls (IAM, KMS)
  • Establish:
    • Fine-grained access controls (row/column-level security)
    • Data lineage and auditability
  • Ensure compliance with:
    • Data privacy regulations (e.g., GDPR)
    • Internal security and audit requirements
  • Partner with IAM teams to integrate with enterprise identity providers (e.g., Entra ID / Azure AD)

4. AI/ML & Advanced Analytics Enablement

  • Architect data models supporting:
    • Contract analytics, clause extraction, and obligation tracking
    • Legal AI use cases (contract review, litigation insights, compliance monitoring, legal spend analytics)
  • Design search and retrieval architectures (RAG) for enterprise legal knowledge bases
  • Enable entity extraction and knowledge graph frameworks
  • Integrate with LLM/GenAI platforms to support capabilities such as document summarization, Q&A, and workflow automation

5. DevOps & Platform Operations

  • Establish CI/CD pipelines and infrastructure-as-code (Terraform, Git-based workflows)
  • Define standards for:
    • Code quality and versioning
    • Environment promotion (Dev / QA / Prod)
  • Implement observability and alerting for platform health and reliability

6. Leadership & Stakeholder Engagement

  • Partner with Legal and Technology leadership to define platform roadmap and priorities
  • Provide architectural governance and design oversight
  • Mentor data engineers and platform teams
  • Translate business and legal requirements into scalable, enterprise-grade solutions
  • Operate within a federated data and platform model, collaborating across engineering, security, and domain teams

What We Value

The skills that will help you succeed in this role include:

  • 10+ years of experience in data architecture, engineering, or analytics platforms
  • 5+ years of hands-on experience with Databricks and Apache Spark
  • Strong experience with AWS-based data platforms
  • Expertise in data governance, security, and compliance in regulated environments
  • Experience working with unstructured data and NLP/document processing pipelines

Education & Preferred Qualifications

Education

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related technical discipline
  • Relevant certifications strongly preferred:
    • Databricks Certified Data Engineer / Architect
    • AWS Certified Solutions Architect (Associate or Professional)

Preferred Qualifications (Core – Databricks, AWS, Data Lakehouse)

  • Strong hands-on experience with the Databricks Lakehouse platform, including Delta Lake, Unity Catalog, Workflows, and MLflow
  • Deep expertise in AWS data platform services, including S3, Glue, EMR, Lambda, Redshift, and IAM
  • Proven experience architecting and delivering enterprise-scale data lakehouse solutions on AWS using Databricks
  • Advanced proficiency in Apache Spark (PySpark/Scala), SQL, and Python
  • Strong understanding of data governance and security, including access controls, metadata management, and encryption (KMS, CMK/BYOK)
  • Experience building end-to-end data pipelines (batch and streaming) and supporting AI/ML workloads within a lakehouse architecture

Nice to Have (Domain & Industry Experience)

  • Experience in Legal, Compliance, Financial Services, or other regulated industries
  • Understanding of legal data constructs, including contracts, clauses, obligations, and matters
  • Experience supporting legal AI use cases (contract analytics, document summarization, compliance monitoring)
  • Experience handling sensitive data in highly regulated, audit-driven environments

Salary Range:

$120,000 - $202,500 Annual

The range quoted above applies to the role in the primary location specified. If the candidate would ultimately work outside of the primary location above, the applicable range could differ.

Employees are eligible to participate in State Street’s comprehensive benefits program, which includes: our retirement savings plan (401(k)) with company match; insurance coverage, including basic life, medical, dental, vision, long-term disability, and other optional coverages; paid time off, including vacation, sick leave, short-term disability, and family care responsibilities; access to our Employee Assistance Program; incentive compensation, including eligibility for annual performance-based awards (excluding certain sales roles subject to sales incentive plans); and eligibility for certain tax-advantaged savings plans.

For a full overview, visit https://hrportal.ehr.com/statestreet/Home.

About State Street

Across the globe, institutional investors rely on us to help them manage risk, respond to challenges, and drive performance and profitability. We keep our clients at the heart of everything we do, and smart, engaged employees are essential to our continued success.

We are committed to fostering an environment where every employee feels valued and empowered to reach their full potential. As an essential partner in our shared success, you’ll benefit from inclusive development opportunities, flexible work-life support, paid volunteer days, and vibrant employee networks that keep you connected to what matters most. Join us in shaping the future.

As an Equal Opportunity Employer, we consider all qualified applicants for all positions without regard to race, creed, color, religion, national origin, ancestry, ethnicity, age, disability, genetic information, sex, sexual orientation, gender identity or expression, citizenship, marital status, domestic partnership or civil union status, familial status, military and veteran status, and other characteristics protected by applicable law.

Discover more information on jobs at StateStreet.com/careers

Job Application Disclosure:

It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.