Job Information
ARCOS LLC Data Architect in United States
ABOUT US
We are ARCOS, an innovative SaaS company dedicated to transforming critical infrastructure industries. We are embarking on a multi-year project focused on scalability, performance, and future innovation. We thrive on data-driven decision-making and are looking for a hands-on Data Architect to help build and shape the foundation of our data infrastructure for the next stage of our growth, with a keen focus on event-driven architectures and vector databases, while also exploring the potential for innovative data applications.
POSITION SUMMARY
As a hands-on Data Architect, you will be crucial in designing, building, and optimizing the data architecture for our next-generation SaaS platform. This position requires expertise in event-driven data architectures (e.g., Apache Kafka) and emerging technologies such as vector databases to support Generative AI applications. You will be deeply involved in implementing scalable, high-performance data systems that drive real-time analytics, AI applications, and dynamic data processing. Experience in the Utility industry and knowledge of AWS are preferred, as we seek to optimize data systems that cater to the unique demands of this sector.
The ARCOS development organization leverages Java, Spring Boot, AWS RDS (Postgres, SQL Server), Oracle, AWS Serverless technologies (Lambda, SQS), REST, JavaScript, and React Native for mobile development. Our systems are hosted in AWS, and we use Atlassian tools (Jira, Bitbucket, and Confluence).
ESSENTIAL JOB FUNCTIONS
Duties and Responsibilities
Design & Build Data Infrastructure: Architect and implement scalable, high-performance data infrastructure focusing on event-driven architectures, real-time data streaming, and advanced AI-driven applications.
Event-Driven Data Solutions: Develop event-driven systems leveraging tools like Apache Kafka or similar technologies to support real-time data processing and low-latency pipelines.
Hands-on Development: Actively develop and maintain data pipelines, ETL/ELT processes, and event-streaming solutions using Apache Kafka, Apache Flink, Apache Spark, or similar tools, as well as AI-specific data systems.
Database Management: Manage and optimize SQL, NoSQL, OLAP, and vector databases to ensure high availability, scalability, and performance. Apply deep knowledge of database internals and mastery of concepts such as partitioning, sharding, embeddings, distributed database systems, and change data capture (CDC) to drive efficiency and reliability across complex, large-scale environments.
Data Integration: Build real-time and batch data pipelines that integrate structured and unstructured data from various sources, including AI models and third-party data sources.
Performance Tuning: Continuously monitor and optimize data systems for performance, ensuring that AI workloads are supported by highly efficient data pipelines and storage solutions.
Collaboration: Work closely with product managers, software engineers, and data scientists to align event-driven architectures, vector databases, and data pipelines with the needs of AI and machine learning models.
Cloud Architecture: Architect and manage cloud-based data solutions (AWS preferred; GCP or Azure) that support distributed data processing, AI workloads, and real-time data streaming.
Vector Databases: Design and implement vector databases (e.g., Pinecone, pgvector, Milvus) to support machine learning models, including Generative AI applications, efficiently handling high-dimensional data such as embeddings and unstructured data.
Required Qualifications:
Bachelor’s degree in Computer Science, Mathematics, Electrical Engineering, or equivalent knowledge and experience.
7+ years of experience as a Data Architect or in a similar data engineering role, with direct involvement in designing and implementing event-driven architectures.
Expertise in vector databases (e.g., Pinecone, Weaviate, Milvus) and their application in Generative AI and other machine learning models, including managing high-dimensional data and embeddings.
Strong understanding of Generative AI applications and how to build data pipelines and infrastructure to support them.
Proficiency in programming (Python, Java, or similar languages) with the ability to write clean, efficient code for event-driven data pipelines and AI-driven data architectures.
Experience with real-time data streaming, ETL/ELT processes, and tools like Apache Kafka, Apache Flink, Kinesis, etc.
Extensive experience with cloud-based data architectures and distributed systems.
Deep understanding of database technologies (SQL, NoSQL, OLAP, vector) and performance optimization for AI workloads.
Strong problem-solving skills and a hands-on approach to addressing technical challenges.
Preferred Qualifications:
Experience in the Utility industry is a plus.
Experience in the SaaS industry or building scalable data systems for AI-powered products.
Familiarity with modern data visualization tools (e.g., Tableau, Looker) and BI platforms.
Experience with machine learning models, especially in Generative AI, and advanced analytics.
COMPANY
At ARCOS, we believe in fostering a culture of ownership, accountability, and teamwork. We value the collective strength of our team and understand that our success results from our collaborative efforts. We're not just looking for employees; we're seeking partners in our mission. If you take pride in your work, are always eager to learn and grow, and believe in the power of teamwork, we want you on our team.
BENEFITS
You will be eligible to participate in ARCOS health benefits (including 100% employer-paid dental and vision premiums for single coverage), a 401(k) with company match, a generous PTO plan, and a technology stipend, to name a few. Please visit our Careers page (www.arcos-inc.com/careers) to learn more about all of these great benefits.