Description
Determine this is the right repository
- I determined this is the correct repository in which to report this bug.
Summary of the issue
Context
I am configuring a dataset for a Custom Classifier processor using the
google-cloud-documentai v1beta3 Python client (DocumentServiceClient.update_dataset).
The underlying REST PATCH call to
projects/{project}/locations/{location}/processors/{processor}/dataset
succeeds and the dataset is correctly attached to the processor (visible in
the Console UI). However, the Python long‑running operation helper raises:
GoogleAPICallError('Unexpected state: Long-running operation had neither response nor error set.')
Expected Behavior
operation.result() on client.update_dataset(...) should complete without
raising an exception when the backend successfully updates the dataset, and
the returned Operation should have either a response or an error set.
Actual Behavior
The dataset is attached correctly (Console shows the dataset GCS URI and the
dataset works for later training), but operation.result() raises
GoogleAPICallError: None Unexpected state: Long-running operation had neither response nor error set.
This makes it difficult to distinguish real failures from this client‑side LRO quirk.
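As a stopgap for telling this case apart from real failures, a wrapper along the following lines might help. It is only a sketch: the helper name `result_tolerating_empty_lro`, the `exc_type` parameter, and the substring match on the error message are my own assumptions, not part of the client library. With the real client one would presumably pass `google.api_core.exceptions.GoogleAPICallError` as `exc_type`.

```python
# Hypothetical helper, not part of google-api-core.
# Substring taken from the error message quoted in this report.
LRO_EMPTY_MARKER = "neither response nor error set"


def result_tolerating_empty_lro(operation, timeout=None, exc_type=Exception):
    """Return operation.result(), or None if only the empty-LRO error occurs.

    Any other exception of type exc_type is re-raised unchanged, so
    permission or validation errors still surface to the caller.
    """
    try:
        return operation.result(timeout=timeout)
    except exc_type as exc:
        if LRO_EMPTY_MARKER in str(exc):
            # The operation finished without a response or an error payload.
            return None
        raise
```

Matching on the message text is brittle and would silently stop working if the wording changes, so this is best treated as a temporary workaround rather than a fix.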
API client name and version
google-cloud-documentai==3.7.0, google-api-core==2.28.1
Reproduction steps: code
file: setup.py
from google.api_core.client_options import ClientOptions
from google.cloud import documentai_v1beta3 as documentai
# Your settings
PROJECT_ID = "PROJECT_ID"
LOCATION = "LOCATION"
PROCESSOR_ID = "PROCESSOR_ID"
gcs_uri_prefix = "GCS_URI"
# Set up the regional endpoint
client_options = ClientOptions(api_endpoint=f"{LOCATION}-documentai.googleapis.com")
# Initialize the synchronous Document AI client, matching the
# DocumentServiceClient.update_dataset call described above
client = documentai.DocumentServiceClient(client_options=client_options)
# Build the full dataset resource name
dataset_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/processors/{PROCESSOR_ID}/dataset"
# Prepare the dataset configuration
dataset = documentai.Dataset(
    name=dataset_name,
    gcs_managed_config=documentai.Dataset.GCSManagedConfig(
        gcs_prefix=documentai.GcsPrefix(
            gcs_uri_prefix=gcs_uri_prefix
        )
    ),
    spanner_indexing_config=documentai.Dataset.SpannerIndexingConfig(),
)
# Prepare the update request
update_request = documentai.UpdateDatasetRequest(
    dataset=dataset
)
# Call the update_dataset API; this returns a long-running operation
operation = client.update_dataset(request=update_request)
# Raises GoogleAPICallError even though the dataset is attached
response = operation.result()
Reproduction steps: supporting files
Reproduction steps: actual results
Log output:
Traceback:
File "setup.py", line 128, in add_processor_dataset
response = operation.result(timeout=600)
File ".../google/api_core/future/polling.py", line 261, in result
raise self._exception
google.api_core.exceptions.GoogleAPICallError: None Unexpected state:
Long-running operation had neither response nor error set.
Despite this, the Console shows the dataset configured correctly under the
processor.
Reproduction steps: expected results
operation.result() should either:
- Return a Dataset (or empty response) when the update succeeds, or
- Raise an error with details from the backend (e.g. permission denied, invalid name).
It should not raise GoogleAPICallError: ... neither response nor error set
when the dataset is successfully attached and visible in the UI.
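To make the states in question concrete, here is a rough sketch of how a finished operation could be classified. `LroState`, `classify_lro`, and `FakeOperation` are hypothetical names introduced for illustration; the only thing assumed about the operation object is that it exposes `done` and `HasField(...)`, as a `google.longrunning.Operation` protobuf message does.

```python
from enum import Enum


class LroState(Enum):
    PENDING = "pending"        # not finished yet
    RESPONSE = "response"      # finished with a response payload
    ERROR = "error"            # finished with an error status
    DONE_EMPTY = "done_empty"  # finished with neither (the case in this report)


def classify_lro(op):
    """Classify an Operation-like object exposing `done` and `HasField`."""
    if not op.done:
        return LroState.PENDING
    if op.HasField("response"):
        return LroState.RESPONSE
    if op.HasField("error"):
        return LroState.ERROR
    return LroState.DONE_EMPTY


class FakeOperation:
    """Stand-in for google.longrunning.Operation, for demonstration only."""

    def __init__(self, done, fields=()):
        self.done = done
        self._fields = set(fields)

    def HasField(self, name):
        return name in self._fields


if __name__ == "__main__":
    # done=True with no fields set mirrors what update_dataset appears to return.
    print(classify_lro(FakeOperation(done=True)))
```

With the real client, I believe the underlying proto is reachable as `operation.operation` on the future returned by `update_dataset`, but that should be checked against the installed google-api-core version.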
OS & version + platform
Base image: python:3.11-slim (Debian-based)
Platform: Cloud Run job (containerized)
Python environment
Python 3.11.x
Python dependencies
Package Version Editable project location
aiohappyeyeballs 2.6.1
aiohttp 3.13.3
aiosignal 1.4.0
annotated-types 0.7.0
async-timeout 5.0.1
asyncpg 0.31.0
attrs 25.4.0
backports-asyncio-runner 1.2.0
bottleneck 1.6.0
certifi 2026.1.4
cffi 2.0.0
charset-normalizer 3.4.4
cryptography 46.0.4
db-dtypes 1.5.0
decorator 5.2.1
deprecated 1.3.1
exceptiongroup 1.3.1
frozenlist 1.8.0
fsspec 2026.2.0
gcsfs 2026.2.0
google-api-core 2.29.0
google-api-python-client 2.189.0
google-auth 2.48.0
google-auth-httplib2 0.3.0
google-auth-oauthlib 1.2.4
google-cloud-bigquery 3.40.0
google-cloud-core 2.5.0
google-cloud-documentai 3.9.0
google-cloud-documentai-toolbox 0.15.0a0
google-cloud-secret-manager 2.26.0
google-cloud-storage 3.9.0
google-cloud-storage-control 1.9.0
google-cloud-vision 3.12.1
google-crc32c 1.8.0
google-resumable-media 2.8.0
googleapis-common-protos 1.72.0
greenlet 3.3.1
grpc-google-iam-v1 0.14.3
grpcio 1.78.0
grpcio-status 1.78.0
httplib2 0.31.2
idna 3.11
immutabledict 4.2.2
iniconfig 2.3.0
intervaltree 3.2.1
jinja2 3.1.6
joblib 1.5.3
llvmlite 0.46.0
lxml 6.0.2
markupsafe 3.0.3
multidict 6.7.1
numba 0.63.1
numexpr 2.14.1
numpy 2.2.6
oauthlib 3.3.1
packaging 26.0
pandas 2.3.3
pandas-gbq 0.33.0
pikepdf 10.3.0
pillow 11.3.0
pluggy 1.6.0
propcache 0.4.1
proto-plus 1.27.1
protobuf 6.33.5
psutil 7.2.2
pyarrow 22.0.0
pyasn1 0.6.2
pyasn1-modules 0.4.2
pycparser 3.0
pydantic 2.12.5
pydantic-core 2.41.5
pydantic-settings 2.12.0
pydata-google-auth 1.9.1
pygments 2.19.2
pyparsing 3.3.2
pytest 9.0.2
pytest-asyncio 1.3.0
python-dateutil 2.9.0.post0
python-dotenv 1.2.1
pytz 2025.2
requests 2.32.5
requests-oauthlib 2.0.0
rsa 4.9.1
scikit-learn 1.7.2
scipy 1.15.3
setuptools 82.0.0
six 1.17.0
sortedcontainers 2.4.0
sqlalchemy 2.0.46
tabulate 0.9.0
threadpoolctl 3.6.0
tifffile 2025.5.10
tomli 2.4.0
tqdm 4.67.3
typing-extensions 4.15.0
typing-inspection 0.4.2
tzdata 2025.3
uritemplate 4.2.0
urllib3 2.6.3
wrapt 2.1.1
yarl 1.22.0
Additional context
If I call the REST API directly with:
curl -X PATCH \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://eu-documentai.googleapis.com/v1beta3/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/dataset"
and the JSON body:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/dataset",
  "gcs_managed_config": {
    "gcs_prefix": {
      "gcs_uri_prefix": "gs://..."
    }
  },
  "spanner_indexing_config": {}
}
the operation succeeds and the dataset is attached, with no error.
Only the Python client’s operation.result() reports the invalid LRO state.
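For anyone who wants to reproduce the direct REST call from Python, the URL and JSON body used with curl above can be assembled roughly as follows. `build_update_dataset_request` is a hypothetical helper written for this report; it only builds the request and does not send it or handle authentication. The regional host pattern `{location}-documentai.googleapis.com` is taken from the client options in the repro code.

```python
import json


def build_update_dataset_request(project_id, location, processor_id, gcs_uri_prefix):
    """Build the PATCH URL and JSON body for the v1beta3 dataset update."""
    name = (
        f"projects/{project_id}/locations/{location}"
        f"/processors/{processor_id}/dataset"
    )
    # Regional endpoint, same pattern as the ClientOptions in the repro.
    url = f"https://{location}-documentai.googleapis.com/v1beta3/{name}"
    body = {
        "name": name,
        "gcs_managed_config": {
            "gcs_prefix": {"gcs_uri_prefix": gcs_uri_prefix},
        },
        "spanner_indexing_config": {},
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder values, as in the rest of this report.
    url, body = build_update_dataset_request(
        "PROJECT_ID", "LOCATION", "PROCESSOR_ID", "gs://..."
    )
    print(url)
    print(body)
```

Sending the request with an HTTP client and a bearer token is left out on purpose, since that part worked fine with curl.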