
3155;3158, Cardenas Terrace, Centerville District, Fremont, Alameda County, California, 94536, United States
Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, our team remains dynamic and mission-driven, focused on transforming business processes through technology.
This role is part of an in-house Statement of Work (SOW) engagement. The client, end customer, and employer for this position is Bright Vision Technologies directly; there is no third-party client, vendor, or implementation partner involved. We do not engage in C2C, 1099, or third-party arrangements for this role.
We are seeking an AI Performance Optimization Engineer to join our established and well-respected organization. This position offers tremendous career growth potential within a long-term, multi-year engagement aligned to our delivery roadmap.
In this role, you will focus on extracting maximum throughput, minimizing latency, and reducing cost across training and inference workloads for large neural network systems. The work spans the full stack, from low-level kernel optimization to distributed system tuning, and requires a deep understanding of GPU architecture, model parallelism, memory management, and compiler-level optimization.
The ideal candidate has demonstrated impact on production AI workloads, utilizing strong instrumentation and measurement discipline to enable rigorous, data-driven optimization decisions. You will work closely with cross-functional partners—including product, design, engineering, operations, and business stakeholders—to translate ambiguous requirements into well-engineered solutions. You will be expected to raise the bar through code review, design review, and mentorship of more junior engineers.
For every role, a technical coding assessment is mandatory. Please apply only if you are confident in your technical abilities and hands-on experience.
To apply for immediate consideration, please send your resume to hilda@bvteck.com or contact us at (908) 698-4899. Learn more about Bright Vision Technologies at www.bvteck.com.
Bright Vision Technologies is an Equal Opportunity Employer, including Disability/Veterans. We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success.
We are committed to equal employment opportunity for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall.
We expressly prohibit any form of workplace harassment or discrimination. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs.
Work model: Remote
Fremont, California
Experience optimizing LLM inference at production scale. Contributions to vLLM, TensorRT-LLM, DeepSpeed, or similar projects. Familiarity with custom kernel authoring in Triton or CUTLASS. Experience with FinOps for AI workloads. Publications or talks on AI systems performance.
Experience
6+ yrs (Senior)
Education
Bachelor's degree in Computer Science, Computer Engineering, or related field required
Job Type
Full-Time