Google Vision API
Google Cloud Platform exposes several pre-trained machine learning models, including ones for text and object detection, through APIs.
The Vision API can detect text, localize objects, and label images, among other features.
To give it a try:
Sign up for Google Cloud Platform and fill in your address and credit card details (this creates a billing account)
In the Google Cloud Platform console, create an empty project
Link the billing account to the project through Billing in the left navigation bar
Go to APIs & Services and enable the Vision API for the project
Host your images in a Google Cloud Storage bucket and make the images public
Call the API through the in-browser widget at https://cloud.google.com/vision/docs/quickstart
Modify the following REST request. The imageUri points to your image on Google Cloud Storage (a gs:// URI) or to any other public URL.
{
  "requests": [
    {
      "features": [
        {
          "type": "LABEL_DETECTION"
        }
      ],
      "image": {
        "source": {
          "imageUri": "gs://bucket-name-123/demo-image.jpg"
        }
      }
    }
  ]
}
The type can be LABEL_DETECTION, DOCUMENT_TEXT_DETECTION, OBJECT_LOCALIZATION, etc., for labeling images, detecting text, and detecting objects together with their locations, respectively.
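The same request body can also be assembled in code. The following is a minimal sketch using only the Python standard library; the helper name build_annotate_body is hypothetical, and the bucket path is the placeholder from the example request.

```python
import json


def build_annotate_body(image_uri, feature_types):
    """Build a JSON-serializable request body for one image.

    image_uri: a gs:// URI or a public http(s) URL.
    feature_types: list of feature type names, e.g. ["LABEL_DETECTION"].
    """
    return {
        "requests": [
            {
                # One feature entry per requested detection type
                "features": [{"type": t} for t in feature_types],
                "image": {"source": {"imageUri": image_uri}},
            }
        ]
    }


if __name__ == "__main__":
    body = build_annotate_body(
        "gs://bucket-name-123/demo-image.jpg",  # placeholder bucket and file
        ["LABEL_DETECTION", "OBJECT_LOCALIZATION"],
    )
    print(json.dumps(body, indent=2))
```

Listing several types in one request asks the API to run all of them on the same image in a single call.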
If you call the API from a REST client, first generate an API key for authentication: https://cloud.google.com/docs/authentication/api-keys
Then post the request body to
POST https://vision.googleapis.com/v1/images:annotate?key=API_KEY
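images:annotate is the method that runs the requested features on the images; label, text, and object detection all go through it, selected by the feature type. Change the method name only when calling a different method, such as files:annotate for PDF or TIFF files.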
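For readers without a dedicated REST client, the POST can be sketched with Python's standard library alone. This assumes the Vision API host vision.googleapis.com and an API key passed as the key query parameter; the function names build_post and send are hypothetical, and API_KEY is a placeholder.

```python
import json
import urllib.parse
import urllib.request

# Vision API annotate endpoint (images:annotate method)
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_post(api_key, body):
    """Prepare (but do not send) the POST request for images:annotate."""
    url = VISION_ENDPOINT + "?" + urllib.parse.urlencode({"key": api_key})
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    body = {
        "requests": [
            {
                "features": [{"type": "LABEL_DETECTION"}],
                "image": {"source": {"imageUri": "gs://bucket-name-123/demo-image.jpg"}},
            }
        ]
    }
    request = build_post("API_KEY", body)  # replace API_KEY with a real key
    print(request.full_url)
    # result = send(request)  # uncomment once a valid key is in place
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before any billable call is made.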
The image does not have to be referenced by URL. The request can instead carry the base64-encoded image bytes in the image's content field.
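As a rough illustration of the inline variant, the sketch below base64-encodes raw image bytes and places them in the content field in place of source.imageUri. The helper names build_inline_body and load_image_bytes are hypothetical, and demo-image.jpg stands in for any local file.

```python
import base64


def load_image_bytes(path):
    """Read a local image file as raw bytes."""
    with open(path, "rb") as f:
        return f.read()


def build_inline_body(image_bytes, feature_type):
    """Build a request body that embeds the image instead of linking to it."""
    # JSON cannot carry raw bytes, so the image is sent as base64 text
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "features": [{"type": feature_type}],
                "image": {"content": encoded},
            }
        ]
    }


if __name__ == "__main__":
    # Assumes demo-image.jpg exists in the working directory
    body = build_inline_body(load_image_bytes("demo-image.jpg"), "DOCUMENT_TEXT_DETECTION")
    print(len(body["requests"][0]["image"]["content"]), "base64 characters")
```

Inline content avoids having to host the image publicly, at the cost of a larger request.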
There are client libraries available for Python, C#, Java, etc.
Install the library locally and call the API programmatically. Snippet in Python:
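from google.cloud import vision  # google-cloud-vision 2.x or later; older releases used types.Image

client = vision.ImageAnnotatorClient()
# image_path is the path to a local image file
with open(image_path, 'rb') as image_file:
    content = image_file.read()
image = vision.Image(content=content)
# Run document (dense) text detection; the full result is in full_text_annotation
response = client.document_text_detection(image=image)
document = response.full_text_annotation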
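When the API is called over REST instead of through the client library, the results come back as JSON. The sketch below shows one way to read them, assuming the REST field names responses, labelAnnotations, description, score, fullTextAnnotation, and text; the helper names are hypothetical, and the sample response is hand-written for illustration, not real API output.

```python
def _first_result(api_response):
    """Return the result for the first image, raising if the API reported an error."""
    first = api_response["responses"][0]
    if "error" in first:
        raise RuntimeError(first["error"].get("message", "unknown error"))
    return first


def extract_labels(api_response):
    """Return (description, score) pairs from a LABEL_DETECTION result."""
    first = _first_result(api_response)
    return [(a["description"], a.get("score")) for a in first.get("labelAnnotations", [])]


def extract_text(api_response):
    """Return the full text from a DOCUMENT_TEXT_DETECTION result, or '' if absent."""
    first = _first_result(api_response)
    return first.get("fullTextAnnotation", {}).get("text", "")


if __name__ == "__main__":
    # Hand-written sample, shaped like a label detection response
    sample = {"responses": [{"labelAnnotations": [{"description": "cat", "score": 0.9}]}]}
    for description, score in extract_labels(sample):
        print(description, score)
```

Each entry in responses corresponds to one entry in the requests list, so a batch of several images would need a loop over responses instead of index 0.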