Senior Data Engineer - Telecommute Opportunity
New York City, United States
Dataminr
Dataminr’s real-time AI platform detects the earliest signals of high-impact events and emerging risks from within publicly available data.
Who we are:
Dataminr puts real-time AI and public data to work for our clients, generating relevant and actionable alerts for global corporations, public sector agencies, newsrooms, and NGOs. Our leading AI platform detects the earliest signals of high-impact events and emerging risks from vast amounts of publicly available information. Our real-time alerts enable tens of thousands of users at hundreds of public and private sector organizations to learn first of breaking events around the world, develop effective risk mitigation strategies, and respond with confidence as crises unfold.
Dataminr is making its mark for growth and innovation, recently earning recognition on Deloitte’s Technology Fast 500 as well as the Forbes Cloud 100 for six consecutive years. We were also recognized as one of Built In's Best Companies to Work for in 2022 for the second consecutive year.
Join our team and help the world manage risk in real time. You’ll work with 1,000+ talented people across eight offices, united by our passion to collaborate, make a difference, and have fun while doing it!
Who you are:
We’re looking for a Senior Data Engineer to receive and store data from internal engineering teams and external sources in the company’s data lake. In this role, you will:
- Design, develop, and store new source data from structured, unstructured, and semi-structured data and APIs.
- Build Extract-Transform-Load (ETL) pipelines to ingest data from the data lake into the company’s data warehouse.
- Create new ETL pipelines, and maintain and troubleshoot existing ones.
- Design, document, and build orchestrations for new ETL jobs.
- Design, document, and maintain architecture diagrams for systems, processes, and interfaces.
- Build and maintain new and existing data ingestion infrastructure.
- Build and maintain new and existing continuous integration and continuous deployment (CI/CD) pipelines.
- Collaborate with internal engineering teams to ensure data quality and delivery, including for new datasets.
- Create dimensional models and build database tables for new databases.
- Build reports for ad-hoc stakeholder data requests.
- Work with business stakeholders to translate business needs into engineering requirements.
Technologies: Amazon Web Services (including EC2, S3, Kinesis Firehose, and ECS), Snowflake, SQL, Apache Kafka, Kafka Connect, Kafka MirrorMaker, GitLab CI/CD, Apache Airflow, Fluentd and Fluent Bit, Java, Python, Docker, Maven, Spring Boot, Apache Spark, Parquet, Avro, Protobuf, Linux.
May telecommute from any location in the U.S.
Required Skills & Experience:
- This position requires a Master’s degree or equivalent in Computer Science, Computer Engineering, or a related field and 5 years related software development experience.
- Must also have each of the following:
- 3 years of experience designing, building, and developing data lakes and data warehouses with AWS Redshift or Snowflake, experience in data modelling with PostgreSQL or MySQL, and working with Business Intelligence Tools such as Tableau, PowerBI, and Looker;
- 3 years of experience with Amazon Web Services including Elastic MapReduce, Glue, S3, EC2, RDS (MySQL or PostgreSQL), and Lambda;
- 2 years of experience building streaming data applications with Kafka;
- 3 years of experience working with large datasets and building an Extract-Transform-Load (ETL) pipeline involving diverse data sources and using Apache Airflow to orchestrate the ETL jobs;
- 3 years of experience working with GitHub or GitLab as a source code management tool and with continuous integration and deployment pipelines; and
- 3 years of experience logging, monitoring, and alerting on system performance and scalability with Grafana and Prometheus.
- Will accept experience gained concurrently.
- Salary: $175,000-$187,500
Why you should work here:
- We recognize and reward hard work with:
- competitive compensation package including company equity.
- paid benefits for employees and their dependents, including medical, dental, vision, disability and life insurance.
- 401(k) savings plan with company matching.
- flexible spending account for out-of-pocket medical, transit, parking and dependent care expenses.
- We want you to be your best, authentic self by supporting you with:
- A diverse, driven, and passionate team of coworkers who want you to succeed.
- Opportunities to own and drive critical projects.
- Individual Learning and Development fund and professional training.
- Generous leave and flexible hours.
- Daily catered lunch and a fully stocked kitchen.
- And more!
Dataminr is an equal opportunity and affirmative action employer. Individuals seeking employment at Dataminr are considered without regard to race, sex, color, creed, religion, national origin, age, disability, genetics, marital status, pregnancy, unemployment status, sexual orientation, citizenship status, or veteran status.
The salary range for this position is as indicated below. Base salary ranges may vary by geographic location, applicant skills, and prior relevant experience, among other factors.
$175,000-$187,500/year.