GSoC 2026: Interested in Project 1 - Build a GUI Agent with local LLM/VLM and OpenVINO #34239
Replies: 4 comments 2 replies
-
Hi @KarSri7694 Thanks for your interest. We can offer a remote device with 32 GB RAM (18 GB vRAM) for your development. Model selection is entirely up to you; we will only evaluate the user experience of the final results.
-
Hi Ethan (@openvino-dev-samples) and Zhuo (@zhuo-yoyowz), I've just sent you an email with the draft of my GSoC proposal for the project. Whenever you have time, I would greatly appreciate any feedback before I submit the final proposal on the GSoC website. Thanks for your time!
-
Hey Ethan (@openvino-dev-samples) and Zhuo (@zhuo-yoyowz), could you please take a look at my question in #34555? I have written code that implements get_state and set_state even when the KV cache is quantized to q8_0. My question concerns the current implementation:

```cpp
void VariableStateIndirectKVCacheCompressed::set_state(const ov::SoPtr<ov::ITensor>& state) {
    OPENVINO_THROW("[GPU] set_state API is supported only when KV-cache compression is disabled");
}

ov::SoPtr<ov::ITensor> VariableStateIndirectKVCacheCompressed::get_state() const {
    OPENVINO_THROW("[GPU] get_state API is supported only when KV-cache compression is disabled");
}
```

Is the OPENVINO_THROW intentional and expected behaviour, or is it simply unimplemented? Also, my code currently only works when the KV cache is quantized to INT8; should I also implement this for 4-bit KV-cache quantization?
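For context on what a compressed-state get/set has to do, here is a minimal Python sketch of the idea behind q8_0-style per-block symmetric quantization. This is purely illustrative (the block size, scale handling, and the `CompressedKVState` class are my own stand-ins, not OpenVINO's GPU-plugin code): set_state re-quantizes an incoming fp32 tensor into the compressed layout, and get_state dequantizes it back, lossily.

```python
import numpy as np

BLOCK = 32  # q8_0 quantizes values in blocks of 32, one fp32 scale per block


def quantize_q8_0(x: np.ndarray):
    """Symmetric per-block int8 quantization (illustrative, not a real kernel)."""
    blocks = x.reshape(-1, BLOCK)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(blocks / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float32)


def dequantize_q8_0(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Invert the quantization; error is bounded by half a block scale."""
    return (q.astype(np.float32) * scales).reshape(-1)


class CompressedKVState:
    """Toy stand-in for a variable state whose KV cache is stored as q8_0."""

    def __init__(self, size: int):
        self._q, self._scales = quantize_q8_0(np.zeros(size, dtype=np.float32))

    def set_state(self, tensor: np.ndarray) -> None:
        # Re-quantize the incoming fp32 state into the compressed layout.
        self._q, self._scales = quantize_q8_0(tensor.astype(np.float32))

    def get_state(self) -> np.ndarray:
        # Decompress back to fp32; values are approximate, not bit-exact.
        return dequantize_q8_0(self._q, self._scales)
```

The round trip is lossy, which is one reason a plugin might deliberately refuse get_state/set_state under compression rather than silently return approximate values.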
-
Hi @openvino-dev-samples and @zhuo-yoyowz I have submitted my final proposal through the official GSoC portal. Looking forward to working under your guidance in the coming summer.
-
Hi, I am Kartikeya Srivastava, a second-year B.Tech undergraduate in Computer Science and Engineering. I am highly interested in Project 1: Build a GUI Agent with local LLM/VLM and OpenVINO.
I already have PRs submitted as a prerequisite:
- numpy.diagonal operation for the Keras OpenVINO backend (Merged): Numpy.diagonal PR
- numpy.flip operation for the Keras OpenVINO backend (Merged): Numpy.flip PR

I have been building a local OS agent to understand the constraints of Project 1. I successfully engineered a multi-step agentic loop that can autonomously navigate the Windows GUI (demo attached).
Prompt given in demo: open a notepad and type in it; "The text is written by Ambient AI's vision agent"
Note on the demo video: to achieve the reasoning required for these multi-step actions, the prototype relies on Qwen-3-VL-4B. Because the OpenVINO openvino_genai pipeline does not yet natively support Qwen-3-VL, this specific demo was temporarily routed through llama.cpp on CUDA to validate the agentic orchestration logic.
demo_video.mp4
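The agentic loop described above can be sketched roughly as follows. This is a generic observe-reason-act skeleton, not the prototype's actual code; `model_step`, `execute_action`, and `observe` are hypothetical callables standing in for the VLM call, the GUI automation layer (e.g. something like pyautogui), and the screenshot capture:

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    goal: str
    history: list = field(default_factory=list)  # (observation, action) pairs
    done: bool = False


def run_agent(goal, model_step, execute_action, observe, max_steps=50):
    """Generic observe -> reason -> act loop for a GUI agent.

    model_step(goal, observation, history) returns an action dict such as
    {"type": "click", "x": 10, "y": 20} or {"type": "finish"}.
    """
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        obs = observe()  # e.g. a screenshot of the current desktop
        action = model_step(goal, obs, state.history)
        state.history.append((obs, action))
        if action.get("type") == "finish":
            state.done = True
            break
        execute_action(action)  # e.g. dispatch to the OS automation layer
    return state
```

The interesting engineering for long-horizon tasks is inside `model_step`: deciding how much of `history` to feed back to the VLM per step, since the KV cache and context window are what blow up around ~50 steps.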
I have one question regarding the scope of this project. My primary GSoC objective would be to port this exact agentic loop natively to OpenVINO and to enhance the model's state management so it can successfully perform long-horizon tasks of around ~50 steps. When optimizing the VLM for typical AI PCs (e.g., Intel Core Ultra NPUs or Iris Xe iGPUs), what is the strict memory budget we are targeting? Should I optimize a 4B-parameter model for maximum performance while maintaining respectable accuracy, quantizing the weights to INT8 or INT4 and quantizing the KV cache to reduce memory requirements even further, or do we have the memory budget to use larger, more capable models?
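To make the budget question concrete, here is a back-of-the-envelope memory estimate. The model geometry below (36 layers, 8 KV heads of dim 128 under GQA) is a hypothetical configuration for a ~4B model, not Qwen-3-VL-4B's real config, and the formulas ignore quantization scale overhead, activations, and the vision encoder:

```python
def model_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory, ignoring per-group scale/zero-point overhead."""
    return n_params * bits_per_weight / 8 / 2**30


def kv_cache_gib(n_layers, n_kv_heads, head_dim, seq_len, bits, batch=1):
    """KV cache = 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bits / 8 / 2**30


# Hypothetical GQA geometry for a ~4B model (illustrative, not a real config):
LAYERS, KV_HEADS, HEAD_DIM = 36, 8, 128

for wbits, kvbits in [(16, 16), (8, 8), (4, 8)]:
    total = model_memory_gib(4e9, wbits) + kv_cache_gib(
        LAYERS, KV_HEADS, HEAD_DIM, seq_len=32_768, bits=kvbits
    )
    print(f"weights INT{wbits} + KV INT{kvbits} @ 32k ctx: {total:.1f} GiB")
```

Under these assumptions, INT8 weights plus an INT8 KV cache keep a 4B model comfortably inside an 18 GB vRAM envelope even at long context, which is why the INT4-vs-larger-model trade-off above hinges on the target budget.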
cc: Ethan Yang (@openvino-dev-samples), Zhuo Wu (@zhuo-yoyowz)