Video annotation progress. Included in the metadata field of the Operation
returned by the GetOperation call of the google::longrunning::Operations service.
Video annotation request.
Video annotation response. Included in the response field of the Operation
returned by the GetOperation call of the google::longrunning::Operations service.
Celebrity definition.
Celebrity recognition annotation per video.
The annotation result of a celebrity face track. RecognizedCelebrity field
could be empty if the face track does not have any matched celebrities.
A generic detected attribute represented by name in string format.
A generic detected landmark represented by name in string format and a 2D
location.
Detected entity from video analysis.
Explicit content annotation (based on per-frame visual signals only).
If no explicit content has been detected in a frame, no annotations are
present for that frame.
Config for EXPLICIT_CONTENT_DETECTION.
Video frame level annotation results for explicit content.
Face detection annotation.
Config for FACE_DETECTION.
Label annotation.
Config for LABEL_DETECTION.
Video frame level annotation results for label detection.
Video segment level annotation results for label detection.
Annotation corresponding to one detected, tracked and recognized logo class.
Normalized bounding box.
The normalized vertex coordinates are relative to the original image.
Range: [0, 1].
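As a minimal sketch of what "relative to the original image" means in practice, the snippet below scales normalized coordinates back to pixels. The `NormalizedBox` class and the `to_pixels` helper are hypothetical stand-ins written for illustration, not the library's types; the field names `left`, `top`, `right`, `bottom` are assumed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NormalizedBox:
    """Hypothetical stand-in for a normalized bounding box; values lie in [0, 1]."""
    left: float
    top: float
    right: float
    bottom: float


def to_pixels(box: NormalizedBox, width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale normalized coordinates to pixel coordinates of the original image."""
    for value in (box.left, box.top, box.right, box.bottom):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"normalized coordinate out of range: {value}")
    # Horizontal values scale with the image width, vertical values with its height.
    return (
        round(box.left * width),
        round(box.top * height),
        round(box.right * width),
        round(box.bottom * height),
    )


box = NormalizedBox(left=0.25, top=0.5, right=0.75, bottom=1.0)
pixels = to_pixels(box, width=1920, height=1080)
print(pixels)
```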
Normalized bounding polygon for text (which might not be axis-aligned).
Contains the list of corner points in clockwise order, starting from the
top-left corner. For example, for a rectangular bounding box, when the text
is horizontal it might look like:
0----1
|    |
3----2
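The corner ordering can be illustrated with a short sketch. It assumes image coordinates in which y grows downward, so that top-left, top-right, bottom-right, bottom-left reads clockwise on screen; `rectangle_corners` is a hypothetical helper, not part of the API.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), both normalized to [0, 1]


def rectangle_corners(left: float, top: float, right: float, bottom: float) -> List[Point]:
    """Return the four corners clockwise from top-left, i.e. indices 0, 1, 2, 3."""
    return [
        (left, top),      # 0: top-left
        (right, top),     # 1: top-right
        (right, bottom),  # 2: bottom-right
        (left, bottom),   # 3: bottom-left
    ]


corners = rectangle_corners(0.1, 0.2, 0.6, 0.4)
for index, (x, y) in enumerate(corners):
    print(index, x, y)
```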
A vertex represents a 2D point in the image.
NOTE: the normalized vertex coordinates are relative to the original image
and range from 0 to 1.
Annotations corresponding to one tracked object.
Config for OBJECT_TRACKING.
Video frame level annotations for object detection and tracking. This field
stores per frame location, time offset, and confidence.
Person detection annotation per video.
Config for PERSON_DETECTION.
Config for SHOT_CHANGE_DETECTION.
Provides “hints” to the speech recognizer to favor specific words and phrases
in the results.
Alternative hypotheses (a.k.a. n-best list).
A speech recognition result corresponding to a portion of the audio.
Config for SPEECH_TRANSCRIPTION.
The top-level message sent by the client for the StreamingAnnotateVideo
method. Multiple StreamingAnnotateVideoRequest messages are sent. The first
message must only contain a StreamingVideoConfig message. All subsequent
messages must only contain input_content data.
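The message ordering described above can be sketched as a generator. This is an illustration under stated assumptions, not the client library's API: `StreamingRequest` and `build_requests` are hypothetical names, and the configuration is modelled as a plain dict with made-up contents.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class StreamingRequest:
    """Hypothetical stand-in: exactly one of the two fields is set per message."""
    video_config: Optional[dict] = None
    input_content: Optional[bytes] = None


def build_requests(config: dict, chunks: Iterable[bytes]) -> Iterator[StreamingRequest]:
    """Yield one config-only message, then one content-only message per chunk."""
    # First message: configuration only, no video bytes.
    yield StreamingRequest(video_config=config)
    # Every later message: video bytes only, no configuration.
    for chunk in chunks:
        yield StreamingRequest(input_content=chunk)


# The dict below is an illustrative placeholder, not a real config schema.
requests = list(build_requests({"feature": "STREAMING_LABEL_DETECTION"}, [b"aa", b"bb"]))
print(len(requests))
```

Keeping the configuration out of the data messages means the stream can be produced lazily, one chunk at a time, without buffering the whole video.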
StreamingAnnotateVideoResponse is the only message returned to the client by
StreamingAnnotateVideo. A series of zero or more StreamingAnnotateVideoResponse
messages are streamed back to the client.
Config for STREAMING_AUTOML_ACTION_RECOGNITION.
Config for STREAMING_AUTOML_CLASSIFICATION.
Config for STREAMING_AUTOML_OBJECT_TRACKING.
Config for STREAMING_EXPLICIT_CONTENT_DETECTION.
Config for STREAMING_LABEL_DETECTION.
Config for STREAMING_OBJECT_TRACKING.
Config for STREAMING_SHOT_CHANGE_DETECTION.
Config for streaming storage option.
Streaming annotation results corresponding to a portion of the video
that is currently being processed.
Provides information to the annotator that specifies how to process the
request.
Annotations related to one detected OCR text snippet. This will contain the
corresponding text, confidence value, and frame level information for each
detection.
Config for TEXT_DETECTION.
Video frame level annotation results for text annotation (OCR).
Contains information regarding timestamp and bounding box locations for the
frames containing detected OCR text snippets.
Video segment level annotation results for text detection.
For tracking related features. An object at time_offset with attributes, and
located with normalized_bounding_box.
A track of an object instance.
Annotation progress for a single video.
Annotation results for a single video.
Video context and/or feature-specific parameters.
Video segment.
Word-specific information for recognized words. Word information is only
included in the response when certain request parameters are set, such as
enable_word_time_offsets.