We are looking for someone with a DevOps mindset coupled with a solid foundation in designing and building ML model pipelines and microservices for big data systems and search and retrieval platforms, building NLP systems, and writing RESTful services that interact with other microservices deployed in Kubernetes. In this role, you will design, architect, and code modular search solutions to difficult problems in the information retrieval space. The ideal match is someone who has expertise in search and Lucene-based systems (Elasticsearch in particular) and in sophisticated big data processing systems, and who wants to bring that experience to the table as we build out back-end support for our next-generation search functionality.
• Design and prototype text-based search systems using Elasticsearch.
• Help integrate an end-to-end, large-scale NLP pipeline that processes documents before they are indexed.
• Help develop and integrate a new, robust search DSL with the underlying Elasticsearch platform.
• Design, develop, and improve public-facing APIs and SDKs for interacting with the search platform.
• Develop and maintain advanced query rewriting functionality.
• Familiarity with the concepts of entity name matching and linking for news articles in search systems.
• Understand and implement an indexing strategy for billions of news articles from multiple sources.
Skills
• Experience working with large-scale distributed systems using Python, Spring Boot, Git, Gradle, Spark, and Java
• Strong expertise in Lucene-based technologies (Solr, etc.), especially Elasticsearch
• Solid understanding of Kubernetes, Helm, and Docker
• Experience developing REST/RPC APIs using FastAPI and gRPC
• Experience with messaging systems such as Kafka and RabbitMQ
• Proficiency with Spark, Python, SQL, and Java (or similar), and with data processing using SQL and PySpark.
• Proficiency with building streaming data pipelines using Kafka, Spark, or Flink
• Experience implementing scalable, distributed, and highly available systems using AWS or a similar cloud platform.
• Must be able to install, configure, test, and maintain operating systems, application software, and system management tools.
• Must be able to formulate and design security systems that maintain data safety
• Must be able to oversee the constant availability of the application
• Must have familiarity with application development in the cloud (AWS preferred)
• Experience working with big data and distributed systems and computing tools like S3, Hive, MapReduce, and Spark
Plus:
• Strong understanding of search and information retrieval, including indexing, query rewriting, retrieval, and ranking
• Good familiarity with machine learning frameworks like scikit-learn and deep learning frameworks like PyTorch Lightning, TensorFlow Serving, or Keras.
• Familiarity with natural language processing, text mining, and recommender systems.
Send your CV to email@example.com