Company "КЛАРИТИ / CLARITY"
Graip.AI is a low-code intelligent document processing (IDP) platform that automates document processing with close to zero IT involvement and includes a suite of AI models to select from based on customer needs.
REQUIREMENTS:
- 5+ years of experience in Python development, with a strong focus on backend and system design.
- Proficient with Python frameworks like Flask, Django or FastAPI.
- Hands-on experience with containerization technologies such as Docker, and orchestration tools like Kubernetes.
- Experience with cloud platforms (AWS, GCP, or Azure).
- Familiarity with CI/CD pipelines for deploying Python services in production.
- Strong experience with asynchronous programming in Python (e.g., asyncio, aiohttp).
- Experience using task queues like Celery or other distributed task management tools.
- Expertise in SQL, with a focus on performance optimization and data storage strategies.
- Proficient in pytest for unit and integration testing.
- Extensive experience with refactoring legacy code for improved performance and maintainability.
- Experience working with document formats like PDFs, images, and various structured/unstructured data formats.
- Understanding of data structures, algorithms, and best practices for clean, efficient code.
CONSIDERED AN ADVANTAGE:
- Understanding of machine learning concepts and practical experience with ML frameworks like scikit-learn, XGBoost, PyTorch, or TensorFlow.
- Knowledge of Natural Language Processing (NLP), specifically around document parsing and text extraction (e.g., spaCy, NLTK, or transformers).
- Experience in building or maintaining ML/ETL pipelines using tools like Airflow or MLflow.
- Prior experience with MLOps practices for deploying machine learning models at scale.
- Experience with Large Language Models (LLMs) and applications such as Retrieval-Augmented Generation (RAG) for document or text-based workflows.
- Data analysis skills, with experience in building analytical tools or services to extract insights from document datasets.
RESPONSIBILITIES:
- Develop, maintain, and optimize a scalable platform for automated document processing.
- Enhance data preprocessing pipelines and text extraction modules.
- Refactor legacy code to improve maintainability, performance, and scalability.
- Implement asynchronous programming techniques to improve system performance.
- Ensure code quality, test coverage, and maintainability through unit testing, integration testing, and code reviews.
- Debug, troubleshoot, and fix issues in both development and production environments.
- Participate in architecture and design discussions to improve the platform’s scalability and performance.