airisDATA Inc. is a startup system integration company focused primarily on Hadoop data engineering and machine learning services. We take pride in our team of passionate data engineers, machine learning specialists, and data administrators who build, run, and deploy data solutions on large-scale Hadoop clusters.
Position overview
Position: Data Scientist
Terms: Full Time Permanent
Location: Princeton, NJ; New York, NY.
Work Status: Authorized to work in the U.S. without sponsorship
Job Challenge
Do you love large-scale data technologies that shape the next generation of data-driven business?
Are you an engineer at heart, with an entrepreneurial spirit and a can-do attitude, who enjoys the adrenaline rush of solving a business challenge? If yes, we would love to chat with you about your passion, vision, and experience.
What can you expect?
- A dynamic role with a diverse set of customers, various industry domains, and challenging solutions
- Flat organizational structure with significant opportunity for advancement and leadership
- Open culture where commitment, respect, and open communication are essential
- Perfect balance of creativity, innovation, analytical thinking, and rolled-up sleeves
Requirements
Our ideal candidate has a background in distributed applications, data warehousing, and databases. The candidate is highly proficient in the Hadoop ecosystem and the Apache Spark big data stack. The candidate has strong programming skills in Scala, Java, Python, and SQL, and has expertise in statistical algorithms for data analysis.
- M.S. in Computer Science, Statistics, Machine Learning, or Applied Math. Ph.D. preferred.
- 2+ years in ML algorithmic design, predictive modeling, and recommender systems.
- Experience implementing Regression, Bayesian, Decision Trees, Random Forests, SVM, Clustering, Instance-based methods, Association Rules, Dimensionality Reduction, etc.
- Strong hands-on programming experience in R, Python, and Java (Scala a plus)
- Experience using the Apache Hadoop 2.0 ecosystem and Apache Spark is a plus
- Familiarity with open-source Python-based ML libraries and H2O is a plus
- Top-down thinker, excellent communicator, and great problem solver
Nice to Have
- Knowledge of big data architectures and technologies (Spark preferred).
- Expertise in leading cloud technologies like Amazon Web Services.
- Certification in Hadoop (Cloudera or Hortonworks) and Apache Spark.
- Experience with NoSQL databases such as HBase, Accumulo, Cassandra, and Neo4j.
- Domain expertise in a couple of the following: Telecommunications, Digital Media, Digital Advertising, Finance, Retail.
Benefits
- Competitive salary
- Bonus
- Stock options
- Medical insurance
- Paid vacation
- Mac equipment