
Data Engineer

STEMtech
Full-time
On-site
Atlanta, Georgia, United States

Company Description

STEMtech has a client in the Atlanta area looking for a Data Engineer who will help optimize operations by manipulating and aggregating disparate operational and back-office data sources into a format that is easily digestible by both data scientists and statistically adept colleagues. His/her core responsibility will be to combine large volumes of disparate, complex data, conduct quality checks on the data, manipulate the data, and ensure continuous access to a clean version of the operational data for data scientists and other stakeholders. In addition, he/she will assist in developing the data pipeline to ensure ongoing data collection, consolidation, and management.

Job Description

·         Creates data ingestion pipelines and processes based on jointly defined requirements

·         Profiles and analyzes data to identify gaps and potential data quality issues; works with business SMEs to resolve these issues

·         Identifies relationships between disparate data sources

·         Uses Python, R, Informatica, and other Big Data tools and technologies to code data engineering routines

·         Designs and develops data engineering routines for feature extraction, feature generation, and feature engineering

·         Works with data scientists and business SMEs to gather requirements and present the relevant details in the data

·         Designs and jointly develops the data architecture with the data architect, and ensures its security and maintenance

·         Explores suitable options, designs, and creates data pipelines (data lakes / data warehouses) for specific analytical solutions

·         Identifies gaps and implements solutions for data security, quality and automation of processes

·         Builds data tools and products for effort automation and easy data accessibility

·         Supports maintenance, bug fixing, and performance analysis along the data pipeline

·         Diagnoses existing architecture and data maturity and identifies gaps

·         Gathers requirements, assesses gaps, and builds roadmaps and architectures to help the analytics-driven organization achieve its goals

Qualifications

·         8-10 years of experience in data engineering and data lakes using any Hadoop ecosystem

·         Bachelor’s degree in Computer Science, Engineering, and/or a background in Mathematics and Statistics; Master’s or other advanced degree a plus

·         Previous leadership experience

·         Experience on Big Data platforms (e.g. Hadoop, MapReduce, Spark, HBase, HDInsight, Databricks, Hive) and with scripting and programming languages such as UNIX shell and Python

·         Has used SQL, PL/SQL, or T-SQL with RDBMSs such as Teradata, MS SQL Server, and Oracle in production environments

·         Experience with reporting and BI packages (e.g. Power BI, Tableau, SAP BO)

·         Strong critical thinking and problem-solving skills

·         Success at working on cross-functional teams to meet a common goal

·         Self-starter with a high sense of urgency

Additional Information

All your information will be kept confidential according to EEO guidelines.