Responsibilities : • Design and implement GPU-accelerated machine learning models (e.g., XGBoost, autoencoders, and GANs using Tesseract) to identify fault patterns in timeseries sensor data. • ...
AI/ML Vision Engineer
$79K - $106K/yr
OCR-related experience (such as Tesseract, PaddleOCR, EasyOCR, or custom models). * Familiarity with object detection (such as YOLO, Faster R-CNN, SSD, etc.). * Knowledge of classification, feature ...
This position requires any amount of experience with the following: applying pattern-matching techniques to build language models and support informed decision-making using EasyOCR and Tesseract ...
Design and implement GPU-accelerated machine learning models (e.g., XGBoost, autoencoders, and GANs using Tesseract) to identify fault patterns in timeseries sensor data. * Digital Twin Engineering:
Associate Director - AI Engineering
New York, NY · Hybrid
$150K - $190K/yr
Gemini, llama, gpt etc. and OCR: textract, tesseract etc. * Strong understanding of the use of neural networks, embeddings, transformers etc. * Cloud platforms (AWS SageMaker, Azure ML; etc)
Associate Director - AI Engineering
Princeton, NJ · Hybrid
$150K - $190K/yr
Gemini, llama, gpt etc. and OCR: textract, tesseract etc. * Strong understanding of the use of neural networks, embeddings, transformers etc. * Cloud platforms (AWS SageMaker, Azure ML; etc)
.NET Developer
New York, NY · On-site
$110K - $175K/yr
Experience with OCR tools (e.g., Azure Document Intelligence, Google Document AI, Tesseract) * Skilled in prompt design (zero-shot, few-shot, chain-of-thought) for reliable outputs * Experience ...
.NET Developer
$110K - $175K/yr
Experience with OCR tools (e.g., Azure Document Intelligence, Google Document AI, Tesseract) * Skilled in prompt design (zero-shot, few-shot, chain-of-thought) for reliable outputs * Experience ...
Associate Director - AI Engineering
Manhattan, NY · Hybrid
$150K - $190K/yr
Gemini, llama, gpt etc. and OCR: textract, tesseract etc. * Strong understanding of the use of neural networks, embeddings, transformers etc. * Cloud platforms (AWS SageMaker, Azure ML; etc)
Associate Director - AI Engineering
New York, NY · On-site
$150K - $190K/yr
Gemini, llama, gpt etc. and OCR: textract, tesseract etc. * Strong understanding of the use of neural networks, embeddings, transformers etc. * Cloud platforms (AWS SageMaker, Azure ML; etc)
Tesseract information
What are the key skills and qualifications needed to thrive as a Tesseract OCR Specialist, and why are they important?
What are some common challenges faced by professionals working with the Tesseract OCR engine, and how can they be addressed?
What are Tesseract jobs?
What is the difference between Tesseract and an OCR Technician?
| Aspect | Tesseract | OCR Technician |
|---|---|---|
| Required Credentials | Basic computer skills, familiarity with OCR software | Technical training or certification in OCR or image processing |
| Work Environment | Software development, data processing | Data entry centers, document processing facilities |
| Industry Usage | Used by developers for OCR projects | Employed in document digitization and data extraction roles |
| Common Search/Comparison | Yes | Yes |
While Tesseract is an open-source OCR engine used primarily by developers for integrating OCR into applications, OCR Technicians are professionals who operate OCR systems in data entry or document processing environments. Tesseract requires programming knowledge, whereas OCR Technicians focus on manual or semi-automated data extraction tasks.

Full-time
Posted 4 days ago
Caterpillar Inc. rating
7.5
Based on 458 frontline employees who took The Breakroom Quiz
218th of 417 rated machine equipment manufacturers
Job description
Caterpillar Inc. is the world’s leading manufacturer of construction and mining equipment, committed to building a better, more sustainable world. The Lead Data Scientist will drive the development and integration of digital twins and GenAI-assisted predictive analytics for condition monitoring of Caterpillar equipment.
Responsibilities:
• Design and implement GPU-accelerated machine learning models (e.g., XGBoost, autoencoders, and GANs using Tesseract) to identify fault patterns in time-series sensor data.
• Partner with engineering teams to develop onboard digital twins using NVIDIA architecture (e.g., PhysicsNeMo) to simulate, predict, and optimize the performance of heavy machinery.
• Profile and tune deep learning algorithms for maximum efficiency on NVIDIA GPU architectures, ensuring high throughput and low latency for real-time monitoring.
• Adapt and test algorithms for onboard architecture, leveraging tools like NVIDIA Jetson for ROM generation and real-time edge processing on Cat equipment.
• Collaborate with hardware / simulation engineers to ensure algorithm compatibility with next-generation processors and specialized onboard compute modules.
• Use high-fidelity digital twins to simulate rare failure scenarios, ensuring the GenAI assistant provides accurate troubleshooting steps for edge-case mechanical issues.
• Develop Generative AI agents that synthesize telematics data to generate prioritized repairs for identified machine faults.
• Integrate multi-modal outputs from condition monitoring analytics and asset life history to create a machine-specific context for the AI assistant.
Qualifications:
Required:
• Typically, a Bachelor’s, Master’s, or PhD degree in Applied Statistics, Data Science, Business Analytics, Predictive Analytics, Business Intelligence & Analytics, Mathematics, Computer Science, Engineering (Aerospace, Electrical, Mechanical, Computer, Industrial, Agricultural, etc.), or an equivalent technical degree.
• Extensive experience applying Python (NumPy, SciPy, pandas, etc.) programming to solve business challenges.
• Extensive experience (typically 8+ years) with advanced data analysis, machine learning methods such as clustering, logistic regression, and neural nets, and statistical methods such as statistical process control.
• Experience in practical applications of onboard architecture / software (mini projects using Raspberry Pi or a similar architecture are a bonus).
• Working experience with heavy equipment engineering or data analysis.
• Working knowledge of cloud technologies (AWS, Azure, Google Cloud, etc.)
• Advanced experience with version control / repositories such as GitHub
• Experience operating in an Agile environment
• Must demonstrate strong initiative, interpersonal skills, and the ability to communicate effectively.
Preferred:
• Generative AI & LLMs: Proficiency in Fine-tuning and Prompt Engineering for Large Language Models, specifically using Retrieval-Augmented Generation (RAG)
• Condition Monitoring Algorithms: Deep understanding of Anomaly Detection, Time-Series Analysis, and Predictive Maintenance models.
• Telematics: Experience handling high-frequency IoT sensor data, CAN bus protocols (J1939), and integrating with unified data platforms
• Experience with high-performance computing
• Business Statistics: Extensive experience with statistical tools, processes, and practices to describe business results in measurable scales; ability to use statistical tools and processes to assist in making business decisions.
• Analytical Thinking: Extensive knowledge of techniques and tools that promote effective analysis; ability to determine the root cause of organizational problems and create alternative solutions that resolve these problems.
• Programming Languages: Extensive knowledge of basic concepts and capabilities of applying Python programming to solve business challenges; ability to use tools, techniques and platforms in order to write and modify programming languages.
• Requirements Analysis: Working knowledge of tools, methods, and techniques of requirements analysis; ability to elicit, analyze, and record required functional and non-functional business requirements to ensure the success of a system or software development project.
Company:
For 100 years, we’ve been helping customers build a better, more sustainable world. Founded in 1925, the company is headquartered in Peoria Heights, USA, with a team of 10001+ employees. The company is currently Late Stage.