Skip to content

Comments

[Feat] Add support for MAA ocr#274

Open
TheNotoBarth wants to merge 4 commits intoDearVa:mainfrom
TheNotoBarth:feature/ocr
Open

[Feat] Add support for MAA ocr#274
TheNotoBarth wants to merge 4 commits intoDearVa:mainfrom
TheNotoBarth:feature/ocr

Conversation

@TheNotoBarth
Copy link
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings January 29, 2026 09:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for MAA (MaaFramework) OCR text recognition capabilities to the Visual Tree Debugger. The implementation includes downloading OCR models on-demand, performing text recognition on screen captures, and displaying results in a new UI tab.

Changes:

  • Added IOcrService interface and MaaOcrService implementation using MaaFramework for OCR text recognition
  • Extended VisualTreeDebugger UI with an OCR button and results tab to display recognized text
  • Added Maa.Framework NuGet package dependency (version 5.3.0)

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
Directory.Packages.props Added Maa.Framework package version 5.3.0
src/Everywhere.Core/Everywhere.Core.csproj Referenced Maa.Framework package
src/Everywhere.Core/Interop/IOcrService.cs Defined OCR service interface with async methods for model management and recognition
src/Everywhere.Core/Interop/MaaOcrService.cs Implemented OCR service using MaaFramework with model downloading, caching, and recognition logic
src/Everywhere.Core/Views/Controls/VisualTreeDebugger.axaml Added OCR button to toolbar and new OCR tab for displaying recognition results
src/Everywhere.Core/Views/Controls/VisualTreeDebugger.axaml.cs Implemented OCR button click handler with element selection and result display
src/Everywhere.Windows/Program.cs Registered MaaOcrService as singleton in DI container

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 140 to 155
ZipFile.ExtractToDirectory(zipFile, _ocrDir, true);

foreach (var requiredFile in OcrRequiredFiles)
{
var targetPath = Path.Combine(_ocrDir, requiredFile);
if (File.Exists(targetPath)) continue;

var foundFiles = Directory.GetFiles(_ocrDir, requiredFile, SearchOption.AllDirectories);
if (foundFiles.Length > 0)
File.Move(foundFiles[0], targetPath);
}

foreach (var dir in Directory.GetDirectories(_ocrDir))
Directory.Delete(dir, true);

File.Delete(zipFile);
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the ZIP extraction fails after the file is created but before extraction completes, or if extraction succeeds but the subsequent file operations fail, the zip file will not be deleted. Consider adding a try-finally block around the file operations to ensure cleanup, or use a using statement with a custom disposable wrapper.

Suggested change
ZipFile.ExtractToDirectory(zipFile, _ocrDir, true);
foreach (var requiredFile in OcrRequiredFiles)
{
var targetPath = Path.Combine(_ocrDir, requiredFile);
if (File.Exists(targetPath)) continue;
var foundFiles = Directory.GetFiles(_ocrDir, requiredFile, SearchOption.AllDirectories);
if (foundFiles.Length > 0)
File.Move(foundFiles[0], targetPath);
}
foreach (var dir in Directory.GetDirectories(_ocrDir))
Directory.Delete(dir, true);
File.Delete(zipFile);
try
{
ZipFile.ExtractToDirectory(zipFile, _ocrDir, true);
foreach (var requiredFile in OcrRequiredFiles)
{
var targetPath = Path.Combine(_ocrDir, requiredFile);
if (File.Exists(targetPath)) continue;
var foundFiles = Directory.GetFiles(_ocrDir, requiredFile, SearchOption.AllDirectories);
if (foundFiles.Length > 0)
File.Move(foundFiles[0], targetPath);
}
foreach (var dir in Directory.GetDirectories(_ocrDir))
Directory.Delete(dir, true);
}
finally
{
if (File.Exists(zipFile))
File.Delete(zipFile);
}

Copilot uses AI. Check for mistakes.
if (window is not null) _windowHelper.SetCloaked(window, false);

// Capture screenshot
var bitmap = await element.CaptureAsync(CancellationToken.None);
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bitmap is assigned to CaptureImage.Source at line 166, which means it's being reused. This is the same bitmap that was just passed to the OCR service and had its data read. However, there's no check if a previous bitmap exists in CaptureImage.Source before replacing it. If CaptureImage.Source contains a bitmap from a previous operation, it could cause a memory leak. Consider disposing the old bitmap before assigning the new one, similar to how it's done at line 141 where CaptureImage.Source is set to null on error.

Suggested change
var bitmap = await element.CaptureAsync(CancellationToken.None);
var bitmap = await element.CaptureAsync(CancellationToken.None);
// Dispose previous image source, if any, to avoid leaking bitmaps
if (CaptureImage.Source is IDisposable oldSource)
{
CaptureImage.Source = null;
oldSource.Dispose();
}

Copilot uses AI. Check for mistakes.
}

foreach (var dir in Directory.GetDirectories(_ocrDir))
Directory.Delete(dir, true);
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any subdirectory deletion fails (due to file locks, permissions, etc.), the entire operation will fail silently in the catch block at line 159. This could leave the OCR directory in an inconsistent state with partial extraction results. Consider handling directory deletion failures more gracefully, or at least logging which specific operation failed.

Suggested change
Directory.Delete(dir, true);
{
try
{
Directory.Delete(dir, true);
}
catch (Exception ex)
{
// Log and continue: a failure deleting a subdirectory should not abort the entire operation.
Console.Error.WriteLine($"[MaaOcrService] Failed to delete directory '{dir}': {ex}");
}
}

Copilot uses AI. Check for mistakes.
Comment on lines 136 to 156
progress?.Report((double)downloadedBytes / totalBytes * 0.9);
}
}

ZipFile.ExtractToDirectory(zipFile, _ocrDir, true);

foreach (var requiredFile in OcrRequiredFiles)
{
var targetPath = Path.Combine(_ocrDir, requiredFile);
if (File.Exists(targetPath)) continue;

var foundFiles = Directory.GetFiles(_ocrDir, requiredFile, SearchOption.AllDirectories);
if (foundFiles.Length > 0)
File.Move(foundFiles[0], targetPath);
}

foreach (var dir in Directory.GetDirectories(_ocrDir))
Directory.Delete(dir, true);

File.Delete(zipFile);
progress?.Report(1.0);
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The progress reporting multiplies by 0.9, suggesting the download is 90% of the total progress. However, after the download completes, progress is reported as 1.0 at line 156 without accounting for the extraction and file manipulation steps in between. This means progress jumps from 0.9 to 1.0, and the extraction/cleanup operations (lines 140-155) don't report any progress. Consider reporting progress for the extraction phase to provide smoother progress updates.

Copilot uses AI. Check for mistakes.
Comment on lines +116 to +117
using var httpClient = new HttpClient();
httpClient.Timeout = TimeSpan.FromMinutes(10);
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a new HttpClient instance directly violates the established pattern in this codebase. The codebase consistently uses IHttpClientFactory for creating HttpClient instances (see KernelMixinFactory, ChatPluginManager, and WebBrowserPlugin). Creating HttpClient directly can lead to socket exhaustion under heavy load. The MaaOcrService should accept IHttpClientFactory as a constructor dependency and use it to create the HttpClient instance.

Copilot uses AI. Check for mistakes.
Comment on lines 34 to 59
public async Task<bool> EnsureModelAsync(IProgress<double>? progress = null, CancellationToken cancellationToken = default)
{
await _initLock.WaitAsync(cancellationToken);
try
{
if (_isInitialized) return true;

if (!CheckOcrFilesExist())
{
var success = await DownloadAndExtractOcrAsync(progress, cancellationToken);
if (!success) return false;
}
else
{
progress?.Report(1.0);
}

EnsurePipelineDir();
InitializeResource();
_isInitialized = true;
return true;
}
finally
{
_initLock.Release();
}
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential race condition: if Dispose() is called while EnsureModelAsync() is running, the _initLock could be disposed while a thread is waiting on it, causing an ObjectDisposedException. The disposal should wait for any in-progress initialization to complete before disposing the semaphore. Consider checking _disposed before calling _initLock.WaitAsync, or use a lock to coordinate between disposal and initialization.

Copilot uses AI. Check for mistakes.

// Run OCR in background
var sw = Stopwatch.StartNew();
var result = await _ocrService.RecognizeAsync(bitmap, CancellationToken.None);
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using CancellationToken.None means the OCR operation cannot be cancelled once started. If the user closes the window or wants to cancel the operation, it will continue running in the background. Consider creating a CancellationTokenSource that gets cancelled when the control is unloaded or providing a way for users to cancel long-running OCR operations.

Copilot uses AI. Check for mistakes.
public async Task<OcrResult?> RecognizeAsync(Bitmap image, CancellationToken cancellationToken = default)
{
using var stream = new MemoryStream();
image.Save(stream);
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bitmap.Save(stream) call is missing the quality parameter. Looking at consistent usage throughout the codebase (VisualTreePlugin.cs:86 and ChatWindowViewModel.cs:529), bitmap.Save should be called with two parameters: bitmap.Save(stream, 100), where 100 is the quality parameter.

Suggested change
image.Save(stream);
image.Save(stream, 100);

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +162
catch
{
return false;
}
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swallowing all exceptions and returning false makes debugging difficult and hides important errors from users. The download could fail for various reasons (network issues, disk full, permission denied, invalid ZIP format), and users need to know why it failed. Consider logging the exception before returning false, or at minimum catching specific expected exceptions separately from unexpected ones.

Copilot uses AI. Check for mistakes.
Comment on lines +244 to +249
foreach (var item in prop.EnumerateArray())
{
var result = ParseSingleResult(item);
if (result != null)
results.Add(result);
}
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This foreach loop immediately maps its iteration variable to another variable - consider mapping the sequence explicitly using '.Select(...)'.

Copilot uses AI. Check for mistakes.
private async Task<bool> DownloadAndExtractOcrAsync(IProgress<double>? progress, CancellationToken cancellationToken)
{
Directory.CreateDirectory(_ocrDir);
var modelDir = Path.GetDirectoryName(_ocrDir)!;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

感叹号洁癖:

ArgumentNullException.ThrowIfNull(modelDir);

Comment on lines 123 to 138
await using (var contentStream = await response.Content.ReadAsStreamAsync(cancellationToken))
await using (var fileStream = new FileStream(zipFile, FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
{
var buffer = new byte[65536];
long downloadedBytes = 0;
int bytesRead;

while ((bytesRead = await contentStream.ReadAsync(buffer, cancellationToken)) > 0)
{
await fileStream.WriteAsync(buffer.AsMemory(0, bytesRead), cancellationToken);
downloadedBytes += bytesRead;

if (totalBytes > 0)
progress?.Report((double)downloadedBytes / totalBytes * 0.9);
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • 使用using var语法,减少大括号
  • 如果totalBytes已知,可以预分配
if (totalBytes > 0)
{
    fileStream.SetLength(totalBytes);
}
  • buffer size可以用81920,考虑到GC和对象分配问题,参见

public sealed class MaaOcrService : IOcrService, IDisposable
{
private const string OcrDownloadUrl = "https://download.maafw.xyz/MaaCommonAssets/OCR/ppocr_v5/ppocr_v5-zh_cn.zip";
private static readonly string[] OcrRequiredFiles = ["det.onnx", "keys.txt", "rec.onnx"];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

使用ReadOnlySpan<string>减少堆分配(虽然不知道现在编译器会不会自动优化这个部分orz

var targetPath = Path.Combine(_ocrDir, requiredFile);
if (File.Exists(targetPath)) continue;

var foundFiles = Directory.GetFiles(_ocrDir, requiredFile, SearchOption.AllDirectories);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

看了下zip里面的文件都是在根的,考虑SearchOption.TopDirectoryOnly

Comment on lines 26 to 32
public MaaOcrService()
{
_bundleDir = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData),
"Everywhere", "MaaBundle");
_ocrDir = Path.Combine(_bundleDir, "model", "ocr");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emmm,这俩fields似乎只在这里set了,直接rua到定义后面赋值)

Comment on lines 36 to 39
await _initLock.WaitAsync(cancellationToken);
try
{
if (_isInitialized) return true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • 使用ValueTask
  • 把检查isInitialized复制放到waitasync上面,零分配
  • 然后把waitasync加configureawait减少线程切换,下面双重检查
if (_isInitialized) return true;
await _initLock.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
    if (_isInitialized) return true;
...


private bool CheckOcrFilesExist()
{
if (!Directory.Exists(_ocrDir)) return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

以防万一,可以加入文件大小校验(至少非空什么的?)或者上面copilot提到的SHA256防止用户下载炸了

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants