Data Scientist

Job Description

Company Profile:

Direct Energy generates electricity, produces natural gas, sells commodities, and services the energy needs of homes and businesses in 46 U.S. states plus the District of Columbia and 10 Canadian provinces. We also help our customers save on their energy bills through energy efficiency. Across more than 50 locations, our team of 6,000+ employees serves over 6 million residential and commercial customer relationships.

Direct Energy is a subsidiary of Centrica plc (LSE:CNA), one of the world's leading integrated energy companies, with over 20 million customers and 34,000 employees worldwide. We are committed to being the most recommended energy and services provider and to leading the transition to a low-carbon society.

The Data Scientist will develop statistical models for prediction, classification, and clustering within Direct Energy. She/He will also develop IT requirements and guide projects that make data available for analytics efforts; develop, extract, and maintain logical and physical data models for data analytics; and assist with the overall governance of the platform, including data integrity, access, and roadmap. Specifically, the Data Scientist will have a detailed understanding of predictive and clustering algorithms and expertise in executing them within a Big Data environment.

This role can be based in our Iselin, NJ; Pittsburgh, PA; or Houston, TX office.


Qualifications:

  • University degree in Mathematics, IT, Engineering, Business, or Science
  • Excellent knowledge of data models and management of data, and data transformation using SQL
  • Experience in Business Analytics, statistical and quantitative analysis, predictive modeling
  • Knowledge of Big Data extraction tools such as HiveQL, Pig, MapReduce, Tez, and Spark
  • Knowledge of statistical modeling in Python with Pandas, NumPy, and Scikit-learn
  • Knowledge of Spark and MLlib
  • Knowledge of machine learning algorithms such as Random Forests, Gradient Boosting, Neural Networks, and clustering algorithms such as K-means
  • Knowledge of Natural Language Processing and application of NLP to sentiment analysis
  • Knowledge of cluster computing, especially Amazon Web Services
  • Knowledge of Jupyter Notebooks
  • Ability and desire to learn new technologies
  • Ability to communicate and establish good relations with multi-disciplinary teams
  • Resourcefulness and ability to work with limited supervision
  • Flexibility and ability to manage multiple tasks and deadlines
  • Customer focus and results orientation, including consistently meeting deadlines


Responsibilities:

  • Translate complex business issues into achievable analytical learning objectives and actionable analytic projects
  • Analyze data to identify opportunities to improve the customer experience and drive actionable insights
  • Create predictive and clustering models utilizing SQL Server and HDFS/Hadoop data sources
  • Define when predictive or clustering models could be utilized and the type of data required to make them insightful
  • Develop, extract and maintain logical and physical data models for data analytics within Direct Energy
  • Check and maintain data quality and hygiene across different systems
  • Design customer-focused data and analytics processes and solutions for business customers
  • In close liaison with Business Analysts, IS developers, and the Information Technology team, research and define the sources from which to pull data, and design front-end solutions
  • Maintain existing solutions, including modifications and bug fixes
  • Respond to continuous changes in the organization so that accurate and timely information is always available to end users

Direct Energy and its subsidiaries are an Equal Opportunity Employer - EOE AA M/F/Vet/Disability