diff --git a/docs/ocr/umn/ALL_META.TXT.json b/docs/ocr/umn/ALL_META.TXT.json new file mode 100644 index 00000000..0640a43c --- /dev/null +++ b/docs/ocr/umn/ALL_META.TXT.json @@ -0,0 +1,635 @@ +[ + { + "dockw":"User Guide" + }, + { + "uri":"ocr_01_0002.html", + "node_id":"ocr_01_0002.xml", + "product_code":"ocr", + "code":"1", + "des":"Optical Character Recognition (OCR) detects and extracts text from images and converts the text into an editable JSON format.OCR provides open APIs, so you can use progra", + "doc_type":"usermanual", + "kw":"What Is OCR?,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"What Is OCR?", + "githuburl":"" + }, + { + "uri":"ocr_01_0028.html", + "node_id":"ocr_01_0028.xml", + "product_code":"ocr", + "code":"2", + "des":"General TableDetects and extracts text and their row and column locations from images of tables in various formats, as well as the text areas outside tables. It is used t", + "doc_type":"usermanual", + "kw":"Functions,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Functions", + "githuburl":"" + }, + { + "uri":"ocr_01_0006.html", + "node_id":"ocr_01_0006.xml", + "product_code":"ocr", + "code":"3", + "des":"There are various factors, such as technology and cost, that limit the performance of OCR services. 
The system-level constraints are the most significant limitations that", + "doc_type":"usermanual", + "kw":"Constraints and Limitations,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Constraints and Limitations", + "githuburl":"" + }, + { + "uri":"ocr_01_0024.html", + "node_id":"ocr_01_0024.xml", + "product_code":"ocr", + "code":"4", + "des":"Identity and Access Management (IAM) lets you control user authentication and access to OCR.Object Storage Service (OBS) is a stable, secure, efficient, and easy-to-use c", + "doc_type":"usermanual", + "kw":"Related Services,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Related Services", + "githuburl":"" + }, + { + "uri":"ocr_01_0050.html", + "node_id":"ocr_01_0050.xml", + "product_code":"ocr", + "code":"5", + "des":"Log in to the OCR console and choose Service Monitoring. On the Service Monitoring page displayed, view the total number of API calls, number of successful API calls, and", + "doc_type":"usermanual", + "kw":"Monitoring,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Monitoring", + "githuburl":"" + }, + { + "uri":"ocr_01_0153.html", + "node_id":"ocr_01_0153.xml", + "product_code":"ocr", + "code":"6", + "des":"Use Postman to call the General Text OCR API.To call an OCR API, perform the following steps:Step 1: Subscribing to a ServiceStep 2: Configuring the EnvironmentStep 3: Us", + "doc_type":"usermanual", + "kw":"Getting Started,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Getting Started", + "githuburl":"" + }, + { + "uri":"ocr_01_0015.html", + "node_id":"ocr_01_0015.xml", + "product_code":"ocr", + "code":"7", + "des":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with 
HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "doc_type":"usermanual", + "kw":"FAQ", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"FAQ", + "githuburl":"" + }, + { + "uri":"ocr_01_0109.html", + "node_id":"ocr_01_0109.xml", + "product_code":"ocr", + "code":"8", + "des":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "doc_type":"usermanual", + "kw":"Token-based Authentication", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Token-based Authentication", + "githuburl":"" + }, + { + "uri":"ocr_01_0062.html", + "node_id":"ocr_01_0062.xml", + "product_code":"ocr", + "code":"9", + "des":"username indicates the name of the user, and domainname indicates the name of the account to which the user belongs.If the token is obtained by an account, user name and ", + "doc_type":"usermanual", + "kw":"What Are the Username, Domain Name, and Project Name in the Token Message Body?,Token-based Authenti", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"What Are the Username, Domain Name, and Project Name in the Token Message Body?", + "githuburl":"" + }, + { + "uri":"ocr_01_0044.html", + "node_id":"ocr_01_0044.xml", + "product_code":"ocr", + "code":"10", + "des":"If you receive a 401 status code when calling the OCR service using a token, it means that the token has expired. 
The token is valid for 24 hours, so it is recommended to", + "doc_type":"usermanual", + "kw":"Why Do I Receive a 401 Status Code Even After Successfully Obtaining a Token?,Token-based Authentica", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Why Do I Receive a 401 Status Code Even After Successfully Obtaining a Token?", + "githuburl":"" + }, + { + "uri":"ocr_01_0040.html", + "node_id":"ocr_01_0040.xml", + "product_code":"ocr", + "code":"11", + "des":"When obtaining a token, refer to the error message and select the appropriate solution.Check whether the service region in the body and the corresponding key value are co", + "doc_type":"usermanual", + "kw":"Why Am I Experiencing Token Retrieval Failure When Calling an OCR API Using Postman?,Token-based Aut", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Why Am I Experiencing Token Retrieval Failure When Calling an OCR API Using Postman?", + "githuburl":"" + }, + { + "uri":"ocr_01_0120.html", + "node_id":"ocr_01_0120.xml", + "product_code":"ocr", + "code":"12", + "des":"When calling an API, if you receive the error message \"code\":403,\"message\":\"This user only supports console access, not programmatic access.\", it means that you do not ha", + "doc_type":"usermanual", + "kw":"Why Don't I Have Programming Access Permission?,Token-based Authentication,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Why Don't I Have Programming Access Permission?", + "githuburl":"" + }, + { + "uri":"ocr_01_0133.html", + "node_id":"ocr_01_0133.xml", + "product_code":"ocr", + "code":"13", + "des":"If the error message \"Error: Header overflow\" is displayed when calling an API using Postman to obtain an authentication token, it means that the header has exceeded its ", + "doc_type":"usermanual", + "kw":"Why Is the Header Overflowing 
When I Call an OCR API Using Postman?,Token-based Authentication,User ", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Why Is the Header Overflowing When I Call an OCR API Using Postman?", + "githuburl":"" + }, + { + "uri":"ocr_01_0046.html", + "node_id":"ocr_01_0046.xml", + "product_code":"ocr", + "code":"14", + "des":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "doc_type":"usermanual", + "kw":"APIs", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"APIs", + "githuburl":"" + }, + { + "uri":"ocr_01_0056.html", + "node_id":"ocr_01_0056.xml", + "product_code":"ocr", + "code":"15", + "des":"Log in to the OCR console and choose Service Monitoring. On the Service Monitoring page displayed, view the total number of API calls, number of successful API calls, and", + "doc_type":"usermanual", + "kw":"How Do I View the API Usage?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I View the API Usage?", + "githuburl":"" + }, + { + "uri":"ocr_01_0102.html", + "node_id":"ocr_01_0102.xml", + "product_code":"ocr", + "code":"16", + "des":"No. It is not recommended to use one OCR API for a different function as each API has its own unique algorithms. Doing so may result in poor recognition results. 
For inst", + "doc_type":"usermanual", + "kw":"Can I Use an OCR API for a Different Function?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Can I Use an OCR API for a Different Function?", + "githuburl":"" + }, + { + "uri":"ocr_01_0093.html", + "node_id":"ocr_01_0093.xml", + "product_code":"ocr", + "code":"17", + "des":"At present, the OCR API can only recognize one image per call. To process multiple images in batches, you would need to write additional code that allows for repeated cal", + "doc_type":"usermanual", + "kw":"Does OCR Support Batch Processing?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Does OCR Support Batch Processing?", + "githuburl":"" + }, + { + "uri":"ocr_01_0033.html", + "node_id":"ocr_01_0033.xml", + "product_code":"ocr", + "code":"18", + "des":"OCR results are initially in JSON format, but they can be easily saved as Word, TXT, or PDF files.", + "doc_type":"usermanual", + "kw":"Is It Possible to Convert OCR Results into Word, TXT, or PDF Files?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Is It Possible to Convert OCR Results into Word, TXT, or PDF Files?", + "githuburl":"" + }, + { + "uri":"ocr_01_0032.html", + "node_id":"ocr_01_0032.xml", + "product_code":"ocr", + "code":"19", + "des":"To recognize an image, it must first be converted to its Base64 coding. 
This section provides an example using Python to explain how to convert a local image to Base64 co", + "doc_type":"usermanual", + "kw":"How Can I Get the Base64 Encoding of an Image?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Can I Get the Base64 Encoding of an Image?", + "githuburl":"" + }, + { + "uri":"ocr_01_0078.html", + "node_id":"ocr_01_0078.xml", + "product_code":"ocr", + "code":"20", + "des":"Cross-region OBS is not supported, and the OBS region must match the region of the service being called.For OBS resources with public read authorization, they can be acce", + "doc_type":"usermanual", + "kw":"Is It Possible to Call the OCR Service From a Different Region Than OBS Resources?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Is It Possible to Call the OCR Service From a Different Region Than OBS Resources?", + "githuburl":"" + }, + { + "uri":"ocr_01_0122.html", + "node_id":"ocr_01_0122.xml", + "product_code":"ocr", + "code":"21", + "des":"Use high-quality images with clear text and no glare as much as possible. When collecting images, try to increase the proportion of the area with text to be recognized, r", + "doc_type":"usermanual", + "kw":"How Can I Improve Recognition Accuracy?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Can I Improve Recognition Accuracy?", + "githuburl":"" + }, + { + "uri":"ocr_01_0123.html", + "node_id":"ocr_01_0123.xml", + "product_code":"ocr", + "code":"22", + "des":"The recognition speed is related to the image size, which affects the time required for network transmission and image Base64 decoding. 
To speed up image processing by th", + "doc_type":"usermanual", + "kw":"How Can I Improve Recognition Speed?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Can I Improve Recognition Speed?", + "githuburl":"" + }, + { + "uri":"ocr_01_0125.html", + "node_id":"ocr_01_0125.xml", + "product_code":"ocr", + "code":"23", + "des":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "doc_type":"usermanual", + "kw":"How Do I Convert a PDF to an Image?,APIs,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Convert a PDF to an Image?", + "githuburl":"" + }, + { + "uri":"ocr_01_0076.html", + "node_id":"ocr_01_0076.xml", + "product_code":"ocr", + "code":"24", + "des":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. 
The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "doc_type":"usermanual", + "kw":"Error Codes", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Error Codes", + "githuburl":"" + }, + { + "uri":"ocr_01_0017.html", + "node_id":"ocr_01_0017.xml", + "product_code":"ocr", + "code":"25", + "des":"Locate the cause based on the API calling result or error code.Check whether you have subscribed to the service.Check whether the token is correct or has expired.Check wh", + "doc_type":"usermanual", + "kw":"Why Am I Unable to Call an OCR API?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"Why Am I Unable to Call an OCR API?", + "githuburl":"" + }, + { + "uri":"ocr_01_0063.html", + "node_id":"ocr_01_0063.xml", + "product_code":"ocr", + "code":"26", + "des":"If an error message and error code are returned when an API is called:If error message \"Incorrect IAM authentication information: decrypt token fail\" and error code \"APIG", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error APIG.0301?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error APIG.0301?", + "githuburl":"" + }, + { + "uri":"ocr_01_0065.html", + "node_id":"ocr_01_0065.xml", + "product_code":"ocr", + "code":"27", + "des":"If error message \"Backend timeout.\" and error code \"APIG.0201\" are displayed when you call an OCR API, the request timed out.Perform the following steps to rectify the fa", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error APIG.0201?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error APIG.0201?", + "githuburl":"" + }, 
+ { + "uri":"ocr_01_0121.html", + "node_id":"ocr_01_0121.xml", + "product_code":"ocr", + "code":"28", + "des":"The error message \"The API does not exist or has not been published in the environment.\" is displayed.Check whether the API URL is correct. For example, check whether the", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error APIG.0101?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error APIG.0101?", + "githuburl":"" + }, + { + "uri":"ocr_01_0137.html", + "node_id":"ocr_01_0137.xml", + "product_code":"ocr", + "code":"29", + "des":"If error message and error code \"error_msg\":\"Orchestration error.\",\"error_code\":\"APIG.0106\" are returned when an API is called, check whether the frontend and backend par", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error APIG.0106?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error APIG.0106?", + "githuburl":"" + }, + { + "uri":"ocr_01_0041.html", + "node_id":"ocr_01_0041.xml", + "product_code":"ocr", + "code":"30", + "des":"The error message \"The throttling threshold has been reached: policy user over ratelimit,limit:XX,time:1 minute\" and error code \"APIG.0308\" are displayed when you call an", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error APIG.0308?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error APIG.0308?", + "githuburl":"" + }, + { + "uri":"ocr_01_0089.html", + "node_id":"ocr_01_0089.xml", + "product_code":"ocr", + "code":"31", + "des":"If error message \"The token must be updated.\" and error code \"APIG.0307\" are displayed when you call an OCR API, the token has expired and needs to be updated.Perform the", + "doc_type":"usermanual", + "kw":"How Do 
I Handle the Error APIG.0307?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error APIG.0307?", + "githuburl":"" + }, + { + "uri":"ocr_01_0103.html", + "node_id":"ocr_01_0103.xml", + "product_code":"ocr", + "code":"32", + "des":"If error message \"Invalid Token header. The Token not contain project item.\" and error code \"ModelArts.0212\" are displayed when you call an OCR API, the token is invalid ", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error ModelArts.0212?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error ModelArts.0212?", + "githuburl":"" + }, + { + "uri":"ocr_01_0064.html", + "node_id":"ocr_01_0064.xml", + "product_code":"ocr", + "code":"33", + "des":"If \"error_code\":\"ModelArts.4603\",\"error_msg\":\"Obtaining the file from the URL failed.\" is displayed, it indicates that the image file fails to be obtained from the URL. T", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error ModelArts.4603 or ModelArts.4704?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error ModelArts.4603 or ModelArts.4704?", + "githuburl":"" + }, + { + "uri":"ocr_01_0031.html", + "node_id":"ocr_01_0031.xml", + "product_code":"ocr", + "code":"34", + "des":"If \"\"error_code\":\"ModelArts.4204\",\"error_msg\":\"Request api error! 
The API XXX is not subscribed.\" is displayed when you call an API, it indicates that you have not subscr", + "doc_type":"usermanual", + "kw":"How Do I Handle the Error ModelArts.4204?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle the Error ModelArts.4204?", + "githuburl":"" + }, + { + "uri":"ocr_01_0096.html", + "node_id":"ocr_01_0096.xml", + "product_code":"ocr", + "code":"35", + "des":"When calling an OCR API, the following error codes may occur related to image quality:Error code AIS.0102: Unsupported image format.Error code AIS.0103: Image size does n", + "doc_type":"usermanual", + "kw":"How Do I Handle Image Quality Errors?,Error Codes,User Guide", + "search_title":"", + "metedata":[ + { + "prodname":"ocr", + "documenttype":"usermanual" + } + ], + "title":"How Do I Handle Image Quality Errors?", + "githuburl":"" + } +] \ No newline at end of file diff --git a/docs/ocr/umn/CLASS.TXT.json b/docs/ocr/umn/CLASS.TXT.json new file mode 100644 index 00000000..1757c93c --- /dev/null +++ b/docs/ocr/umn/CLASS.TXT.json @@ -0,0 +1,317 @@ +[ + { + "desc":"Optical Character Recognition (OCR) detects and extracts text from images and converts the text into an editable JSON format.OCR provides open APIs, so you can use progra", + "product_code":"ocr", + "title":"What Is OCR?", + "uri":"ocr_01_0002.html", + "doc_type":"usermanual", + "p_code":"", + "code":"1" + }, + { + "desc":"General TableDetects and extracts text and their row and column locations from images of tables in various formats, as well as the text areas outside tables. It is used t", + "product_code":"ocr", + "title":"Functions", + "uri":"ocr_01_0028.html", + "doc_type":"usermanual", + "p_code":"", + "code":"2" + }, + { + "desc":"There are various factors, such as technology and cost, that limit the performance of OCR services. 
The system-level constraints are the most significant limitations that", + "product_code":"ocr", + "title":"Constraints and Limitations", + "uri":"ocr_01_0006.html", + "doc_type":"usermanual", + "p_code":"", + "code":"3" + }, + { + "desc":"Identity and Access Management (IAM) lets you control user authentication and access to OCR.Object Storage Service (OBS) is a stable, secure, efficient, and easy-to-use c", + "product_code":"ocr", + "title":"Related Services", + "uri":"ocr_01_0024.html", + "doc_type":"usermanual", + "p_code":"", + "code":"4" + }, + { + "desc":"Log in to the OCR console and choose Service Monitoring. On the Service Monitoring page displayed, view the total number of API calls, number of successful API calls, and", + "product_code":"ocr", + "title":"Monitoring", + "uri":"ocr_01_0050.html", + "doc_type":"usermanual", + "p_code":"", + "code":"5" + }, + { + "desc":"Use Postman to call the General Text OCR API.To call an OCR API, perform the following steps:Step 1: Subscribing to a ServiceStep 2: Configuring the EnvironmentStep 3: Us", + "product_code":"ocr", + "title":"Getting Started", + "uri":"ocr_01_0153.html", + "doc_type":"usermanual", + "p_code":"", + "code":"6" + }, + { + "desc":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "product_code":"ocr", + "title":"FAQ", + "uri":"ocr_01_0015.html", + "doc_type":"usermanual", + "p_code":"", + "code":"7" + }, + { + "desc":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. 
The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "product_code":"ocr", + "title":"Token-based Authentication", + "uri":"ocr_01_0109.html", + "doc_type":"usermanual", + "p_code":"7", + "code":"8" + }, + { + "desc":"username indicates the name of the user, and domainname indicates the name of the account to which the user belongs.If the token is obtained by an account, user name and ", + "product_code":"ocr", + "title":"What Are the Username, Domain Name, and Project Name in the Token Message Body?", + "uri":"ocr_01_0062.html", + "doc_type":"usermanual", + "p_code":"8", + "code":"9" + }, + { + "desc":"If you receive a 401 status code when calling the OCR service using a token, it means that the token has expired. The token is valid for 24 hours, so it is recommended to", + "product_code":"ocr", + "title":"Why Do I Receive a 401 Status Code Even After Successfully Obtaining a Token?", + "uri":"ocr_01_0044.html", + "doc_type":"usermanual", + "p_code":"8", + "code":"10" + }, + { + "desc":"When obtaining a token, refer to the error message and select the appropriate solution.Check whether the service region in the body and the corresponding key value are co", + "product_code":"ocr", + "title":"Why Am I Experiencing Token Retrieval Failure When Calling an OCR API Using Postman?", + "uri":"ocr_01_0040.html", + "doc_type":"usermanual", + "p_code":"8", + "code":"11" + }, + { + "desc":"When calling an API, if you receive the error message \"code\":403,\"message\":\"This user only supports console access, not programmatic access.\", it means that you do not ha", + "product_code":"ocr", + "title":"Why Don't I Have Programming Access Permission?", + "uri":"ocr_01_0120.html", + "doc_type":"usermanual", + "p_code":"8", + "code":"12" + }, + { + "desc":"If the error message \"Error: Header overflow\" is displayed when calling an API using Postman to obtain an authentication token, it 
means that the header has exceeded its ", + "product_code":"ocr", + "title":"Why Is the Header Overflowing When I Call an OCR API Using Postman?", + "uri":"ocr_01_0133.html", + "doc_type":"usermanual", + "p_code":"8", + "code":"13" + }, + { + "desc":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "product_code":"ocr", + "title":"APIs", + "uri":"ocr_01_0046.html", + "doc_type":"usermanual", + "p_code":"7", + "code":"14" + }, + { + "desc":"Log in to the OCR console and choose Service Monitoring. On the Service Monitoring page displayed, view the total number of API calls, number of successful API calls, and", + "product_code":"ocr", + "title":"How Do I View the API Usage?", + "uri":"ocr_01_0056.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"15" + }, + { + "desc":"No. It is not recommended to use one OCR API for a different function as each API has its own unique algorithms. Doing so may result in poor recognition results. For inst", + "product_code":"ocr", + "title":"Can I Use an OCR API for a Different Function?", + "uri":"ocr_01_0102.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"16" + }, + { + "desc":"At present, the OCR API can only recognize one image per call. 
To process multiple images in batches, you would need to write additional code that allows for repeated cal", + "product_code":"ocr", + "title":"Does OCR Support Batch Processing?", + "uri":"ocr_01_0093.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"17" + }, + { + "desc":"OCR results are initially in JSON format, but they can be easily saved as Word, TXT, or PDF files.", + "product_code":"ocr", + "title":"Is It Possible to Convert OCR Results into Word, TXT, or PDF Files?", + "uri":"ocr_01_0033.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"18" + }, + { + "desc":"To recognize an image, it must first be converted to its Base64 coding. This section provides an example using Python to explain how to convert a local image to Base64 co", + "product_code":"ocr", + "title":"How Can I Get the Base64 Encoding of an Image?", + "uri":"ocr_01_0032.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"19" + }, + { + "desc":"Cross-region OBS is not supported, and the OBS region must match the region of the service being called.For OBS resources with public read authorization, they can be acce", + "product_code":"ocr", + "title":"Is It Possible to Call the OCR Service From a Different Region Than OBS Resources?", + "uri":"ocr_01_0078.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"20" + }, + { + "desc":"Use high-quality images with clear text and no glare as much as possible. When collecting images, try to increase the proportion of the area with text to be recognized, r", + "product_code":"ocr", + "title":"How Can I Improve Recognition Accuracy?", + "uri":"ocr_01_0122.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"21" + }, + { + "desc":"The recognition speed is related to the image size, which affects the time required for network transmission and image Base64 decoding. 
To speed up image processing by th", + "product_code":"ocr", + "title":"How Can I Improve Recognition Speed?", + "uri":"ocr_01_0123.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"22" + }, + { + "desc":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "product_code":"ocr", + "title":"How Do I Convert a PDF to an Image?", + "uri":"ocr_01_0125.html", + "doc_type":"usermanual", + "p_code":"14", + "code":"23" + }, + { + "desc":"HUAWEI CLOUD Help Center presents technical documents to help you quickly get started with HUAWEI CLOUD services. The technical documents include Service Overview, Price Details, Purchase Guide, User Guide, API Reference, Best Practices, FAQs, and Videos.", + "product_code":"ocr", + "title":"Error Codes", + "uri":"ocr_01_0076.html", + "doc_type":"usermanual", + "p_code":"7", + "code":"24" + }, + { + "desc":"Locate the cause based on the API calling result or error code.Check whether you have subscribed to the service.Check whether the token is correct or has expired.Check wh", + "product_code":"ocr", + "title":"Why Am I Unable to Call an OCR API?", + "uri":"ocr_01_0017.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"25" + }, + { + "desc":"If an error message and error code are returned when an API is called:If error message \"Incorrect IAM authentication information: decrypt token fail\" and error code \"APIG", + "product_code":"ocr", + "title":"How Do I Handle the Error APIG.0301?", + "uri":"ocr_01_0063.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"26" + }, + { + "desc":"If error message \"Backend timeout.\" and error code \"APIG.0201\" are displayed when you call an OCR API, the request timed out.Perform the following steps to rectify the fa", + "product_code":"ocr", + "title":"How Do I Handle 
the Error APIG.0201?", + "uri":"ocr_01_0065.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"27" + }, + { + "desc":"The error message \"The API does not exist or has not been published in the environment.\" is displayed.Check whether the API URL is correct. For example, check whether the", + "product_code":"ocr", + "title":"How Do I Handle the Error APIG.0101?", + "uri":"ocr_01_0121.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"28" + }, + { + "desc":"If error message and error code \"error_msg\":\"Orchestration error.\",\"error_code\":\"APIG.0106\" are returned when an API is called, check whether the frontend and backend par", + "product_code":"ocr", + "title":"How Do I Handle the Error APIG.0106?", + "uri":"ocr_01_0137.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"29" + }, + { + "desc":"The error message \"The throttling threshold has been reached: policy user over ratelimit,limit:XX,time:1 minute\" and error code \"APIG.0308\" are displayed when you call an", + "product_code":"ocr", + "title":"How Do I Handle the Error APIG.0308?", + "uri":"ocr_01_0041.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"30" + }, + { + "desc":"If error message \"The token must be updated.\" and error code \"APIG.0307\" are displayed when you call an OCR API, the token has expired and needs to be updated.Perform the", + "product_code":"ocr", + "title":"How Do I Handle the Error APIG.0307?", + "uri":"ocr_01_0089.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"31" + }, + { + "desc":"If error message \"Invalid Token header. 
The Token not contain project item.\" and error code \"ModelArts.0212\" are displayed when you call an OCR API, the token is invalid ", + "product_code":"ocr", + "title":"How Do I Handle the Error ModelArts.0212?", + "uri":"ocr_01_0103.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"32" + }, + { + "desc":"If \"error_code\":\"ModelArts.4603\",\"error_msg\":\"Obtaining the file from the URL failed.\" is displayed, it indicates that the image file fails to be obtained from the URL. T", + "product_code":"ocr", + "title":"How Do I Handle the Error ModelArts.4603 or ModelArts.4704?", + "uri":"ocr_01_0064.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"33" + }, + { + "desc":"If \"\"error_code\":\"ModelArts.4204\",\"error_msg\":\"Request api error! The API XXX is not subscribed.\" is displayed when you call an API, it indicates that you have not subscr", + "product_code":"ocr", + "title":"How Do I Handle the Error ModelArts.4204?", + "uri":"ocr_01_0031.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"34" + }, + { + "desc":"When calling an OCR API, the following error codes may occur related to image quality:Error code AIS.0102: Unsupported image format.Error code AIS.0103: Image size does n", + "product_code":"ocr", + "title":"How Do I Handle Image Quality Errors?", + "uri":"ocr_01_0096.html", + "doc_type":"usermanual", + "p_code":"24", + "code":"35" + } +] \ No newline at end of file diff --git a/docs/ocr/umn/PARAMETERS.txt b/docs/ocr/umn/PARAMETERS.txt new file mode 100644 index 00000000..6da8d5f0 --- /dev/null +++ b/docs/ocr/umn/PARAMETERS.txt @@ -0,0 +1,3 @@ +version="" +language="en-us" +type="" \ No newline at end of file diff --git a/docs/ocr/umn/en-us_image_0000001707122028.png b/docs/ocr/umn/en-us_image_0000001707122028.png new file mode 100644 index 00000000..dcf684db Binary files /dev/null and b/docs/ocr/umn/en-us_image_0000001707122028.png differ diff --git a/docs/ocr/umn/en-us_image_0000001707228246.png 
b/docs/ocr/umn/en-us_image_0000001707228246.png new file mode 100644 index 00000000..3d6f6338 Binary files /dev/null and b/docs/ocr/umn/en-us_image_0000001707228246.png differ diff --git a/docs/ocr/umn/en-us_image_0000001707396210.png b/docs/ocr/umn/en-us_image_0000001707396210.png new file mode 100644 index 00000000..d6f55cbd Binary files /dev/null and b/docs/ocr/umn/en-us_image_0000001707396210.png differ diff --git a/docs/ocr/umn/en-us_image_0000001707404882.png b/docs/ocr/umn/en-us_image_0000001707404882.png new file mode 100644 index 00000000..58cfdb76 Binary files /dev/null and b/docs/ocr/umn/en-us_image_0000001707404882.png differ diff --git a/docs/ocr/umn/en-us_image_0000001707409654.png b/docs/ocr/umn/en-us_image_0000001707409654.png new file mode 100644 index 00000000..1375e92b Binary files /dev/null and b/docs/ocr/umn/en-us_image_0000001707409654.png differ diff --git a/docs/ocr/umn/en-us_image_0000001755213865.png b/docs/ocr/umn/en-us_image_0000001755213865.png new file mode 100644 index 00000000..f7b42f24 Binary files /dev/null and b/docs/ocr/umn/en-us_image_0000001755213865.png differ diff --git a/docs/ocr/umn/en-us_image_0000001755220149.png b/docs/ocr/umn/en-us_image_0000001755220149.png new file mode 100644 index 00000000..f37637c4 Binary files /dev/null and b/docs/ocr/umn/en-us_image_0000001755220149.png differ diff --git a/docs/ocr/umn/ocr_01_0002.html b/docs/ocr/umn/ocr_01_0002.html new file mode 100644 index 00000000..1433ae4f --- /dev/null +++ b/docs/ocr/umn/ocr_01_0002.html @@ -0,0 +1,18 @@ + + +
Optical Character Recognition (OCR) detects and extracts text from images and converts the text into an editable JSON format.
+OCR provides open APIs, so you can use programming languages such as Python and Java to call OCR APIs to extract text from images. OCR allows you to automate the collection of key data. It helps you build an intelligent service system to improve efficiency. For details about how to obtain APIs, see Optical Character Recognition API Reference.
+You must have programming capabilities and be familiar with Java, Python, iOS, Android, and Node.js.
+To use OCR, call APIs to detect and extract text from images or scanned documents, convert the text into an editable JSON format, and enter the results into business systems by coding or save them in formats such as TXT or Excel.
+If you are a first-time user, the following sections are a good place to start:
+The Functions section describes the functions that OCR provides.
+OCR provides open APIs for you. For details about how to use OCR, see Getting Started and Optical Character Recognition API Reference.
+There are various factors, such as technology and cost, that limit the performance of OCR services. The system-level constraints are the most significant limitations that affect all sub-services. In addition to these system-level constraints, each sub-service also has its own independent limitations.
+Identity and Access Management (IAM) lets you control user authentication and access to OCR.
+Object Storage Service (OBS) is a stable, secure, efficient, and easy-to-use cloud storage service. OCR APIs involve processing user data, which can be efficiently handled in batches using OBS.
+OCR APIs allow for data retrieval and processing from OBS through temporary or anonymous public authorization.
+Detects and extracts text and their row and column locations from images of tables in various formats, as well as the text areas outside tables. It is used to store information on documents and reports as structured data.
+Detects and extracts text and their locations from images and converts them into structured data.
+If "error_code":"ModelArts.4204","error_msg":"Request api error! The API XXX is not subscribed." is displayed when you call an API, it indicates that you have not subscribed to the API. For details about how to subscribe to an API, see Subscribing to a Service.
+If the service has been subscribed to, check whether the region (or account) where the service is subscribed to is the same as the region (or account) where the service is called. If they are the same, check whether the URL of the API is spelled correctly.
+To recognize an image, you must first convert it to Base64 encoding. This section uses Python to show how to convert a local image to Base64 code. You can also use an online conversion tool.
+Replace d:\demo.jpg in the code with the actual image path.
+import base64
+
+# A raw string keeps the backslash in the Windows path from being read as an escape sequence.
+with open(r"d:\demo.jpg", "rb") as image_file:
+    encoded_string = base64.b64encode(image_file.read()).decode()
+print(encoded_string)
+
OCR results are initially in JSON format, but they can be easily saved as Word, TXT, or PDF files.
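As a minimal sketch of saving a result as a TXT file, the snippet below collects the recognized text from a parsed response and writes one line per text block. The response layout (`result` containing `words_block_list`, each entry with a `words` field) and the helper names `extract_words` and `save_words_as_txt` are assumptions made for illustration; check the API Reference for the fields returned by the API you call.

```python
from pathlib import Path
from typing import List


def extract_words(response: dict) -> List[str]:
    """Collect the recognized text lines from a parsed OCR response.

    Assumes the layout result -> words_block_list -> words (an assumption,
    not confirmed by this guide).
    """
    blocks = response.get("result", {}).get("words_block_list", [])
    return [block["words"] for block in blocks if "words" in block]


def save_words_as_txt(response: dict, txt_path: str) -> int:
    """Write one recognized line per row and return the number of lines written."""
    lines = extract_words(response)
    Path(txt_path).write_text("\n".join(lines), encoding="utf-8")
    return len(lines)
```

A call such as `save_words_as_txt(json.loads(response_text), "ocr_result.txt")` would then produce a plain-text file. Word or PDF output would need an additional library.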
+When obtaining a token, refer to the error message and select the appropriate solution.
+Set username and domainname correctly. Typically, the value of username is the same as that of domainname. If you are not sure about the value, log in to the My Credentials page. If you use an IAM account to obtain the token, set the parameters as follows:
+username: IAM username (subaccount name)
+domainname: account name
+The error message "The throttling threshold has been reached: policy user over ratelimit,limit:XX,time:1 minute" and error code "APIG.0308" are displayed when you call an OCR API.
+Rectify the fault using either of the following methods:
+A 401 status code returned when you call the OCR service with a token indicates that the token has expired. A token is valid for 24 hours. Obtain a new token and call the OCR service again.
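One way to avoid calling with an expired token is to cache the token and renew it shortly before the 24-hour validity period ends. The sketch below is illustrative only: `TokenCache`, the `fetch_token` callable, and the five-minute safety margin are assumptions, not part of the OCR service.

```python
import time
from typing import Callable, Optional

TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # a token is valid for 24 hours
REFRESH_MARGIN_SECONDS = 5 * 60        # assumption: renew a little early


class TokenCache:
    """Reuse a token until shortly before its validity period ends."""

    def __init__(self, fetch_token: Callable[[], str],
                 clock: Callable[[], float] = time.time):
        # fetch_token is a hypothetical callable that requests a new token.
        self._fetch_token = fetch_token
        self._clock = clock
        self._token: Optional[str] = None
        self._obtained_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one if it is missing or too old."""
        age = self._clock() - self._obtained_at
        if self._token is None or age >= TOKEN_LIFETIME_SECONDS - REFRESH_MARGIN_SECONDS:
            self._token = self._fetch_token()
            self._obtained_at = self._clock()
        return self._token

    def invalidate(self) -> None:
        """Drop the cached token, for example after a 401 response."""
        self._token = None
```

After a 401 response, calling `invalidate()` and then `get()` forces a fresh token before the request is retried.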
+Log in to the OCR console and choose Service Monitoring. On the Service Monitoring page displayed, view the total number of API calls, number of successful API calls, and number of failed API calls.
+username indicates the name of the user, and domainname indicates the name of the account to which the user belongs.
+project name indicates the project name, for example, eu-de. For details about how to obtain a project ID, see "Obtaining a Project ID" in Optical Character Recognition API Reference.
+If an error message and error code are returned when an API is called:
+(1) Check whether the token has expired.
+(2) Check whether the request body is correct and whether the token is correct and complete.
+(3) Check whether the environment where the token is obtained is the same as that where the token is invoked.
+If "error_code":"ModelArts.4603","error_msg":"Obtaining the file from the URL failed." is displayed, it indicates that the image file fails to be obtained from the URL. To locate the fault, follow these steps:
+(1) Make sure that the provided URL uses the HTTP or HTTPS protocol, that is, it starts with http:// or https://.
+(2) Check whether the server where the images are stored is stable and reliable, whether the network connection is normal, and whether the server is publicly accessible.
+(3) Check whether the content-type of the downloaded images is a standard type, such as image/gif, image/jpeg, image/png, or image/tiff. It is recommended to use an OBS URL for the request.
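Checks (1) and (3) can be scripted before a URL is sent to the service. The following is a minimal sketch; the helper names `is_http_url` and `is_standard_image_type` are hypothetical, and the set of accepted types contains only the four examples listed above.

```python
from urllib.parse import urlparse

# The content types named in step (3).
STANDARD_IMAGE_TYPES = {"image/gif", "image/jpeg", "image/png", "image/tiff"}


def is_http_url(url: str) -> bool:
    """Return True if the URL uses http or https and names a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def is_standard_image_type(content_type: str) -> bool:
    """Return True if a Content-Type header value is one of the listed image types."""
    # Drop parameters such as "; charset=binary" and normalize the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in STANDARD_IMAGE_TYPES
```

The Content-Type value would typically come from the response headers of a HEAD or GET request to the image URL, which also exercises check (2).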
+If "error_code":"ModelArts.4704","error_msg":"Obtaining the file from the OBS failed." is displayed, it means that the image data fails to be obtained from OBS. Make sure that the OBS path where the images are stored exists and is accessible. If the path exists, make sure that the OBS bucket policy is set to public.
+Avoid using OBS paths across regions. If the OBS bucket and the service to be called are not in the same region, download the images to a local directory and call the service by passing the Base64-encoded image in the image parameter instead.
+If the error message "Backend timeout." and error code "APIG.0201" are displayed when you call an OCR API, the request timed out.
+Perform the following steps to rectify the fault:
+Use a tool, such as Postman, to call the service and check whether the call is successful. If the call is successful, the service API is normal. Perform the following steps to proceed:
+If the fault persists, contact technical support.
+Cross-region OBS is not supported, and the OBS region must match the region of the service being called.
+OBS resources with public read authorization can be accessed over the Internet and therefore support cross-region calls. Although this is convenient, it risks leaking sensitive information, such as personal private data. To avoid this risk, use OCR and OBS in the same region.
+If the error message "The token must be updated." and error code "APIG.0307" are displayed when you call an OCR API, the token has expired and needs to be updated.
+Perform the following steps to rectify the fault:
+At present, the OCR API can only recognize one image per call. To process multiple images in batches, you would need to write additional code that allows for repeated calls to the OCR API.
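A minimal sketch of such a loop is shown below. It assumes a caller-supplied function, here called `recognize_one`, that sends the bytes of one image to an OCR API and returns the parsed JSON response; both that function and `recognize_batch` are hypothetical names used for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Union


def recognize_batch(
    image_paths: Iterable[Union[str, Path]],
    recognize_one: Callable[[bytes], dict],
) -> Dict[str, dict]:
    """Call the OCR API once per image and collect the results by file path.

    recognize_one is a hypothetical callable that sends one image (raw bytes)
    to an OCR API and returns the parsed JSON response.
    """
    results: Dict[str, dict] = {}
    for path in image_paths:
        path = Path(path)
        try:
            results[str(path)] = recognize_one(path.read_bytes())
        except Exception as exc:
            # Record the failure and continue so one bad image does not stop the batch.
            results[str(path)] = {"error": str(exc)}
    return results
```

Because each image is a separate API call, a large batch can reach the throttling threshold, so it may be necessary to pause between calls.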
+When calling an OCR API, the following error codes may occur related to image quality:
+No. It is not recommended to use one OCR API for a different function as each API has its own unique algorithms. Doing so may result in poor recognition results. For instance, using the General Text OCR API to recognize table images is not advised.
+If the error message "Invalid Token header. The Token not contain project item." and error code "ModelArts.0212" are displayed when you call an OCR API, the token is invalid because the project information is missing.
+OCR is a project-level service. To obtain the token for calling an OCR API, you need to set scope to project.
+When calling an API, if you receive the error message "code":403,"message":"This user only supports console access, not programmatic access.", it means that you do not have the programmatic access permission.
+The IAM user does not have the programmatic access permission. Contact the main account holder to adjust the permissions.
+The main account holder can log in to the IAM console, click Security Settings in the Operation column for the corresponding IAM user, and enter the settings page.
+The error message "The API does not exist or has not been published in the environment." is displayed.
+The recognition speed depends on the image size, which affects the time required for network transmission and Base64 decoding. To speed up processing by the OCR API, compress images before uploading them while keeping the resolution required for recognition. JPG is the recommended upload format.
+Based on practical experience, it is generally recommended that small images of documents (with less text) should be below 1 MB, and large images of dense A4-sized documents should be below 2 MB.
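These size recommendations can be checked before upload. The sketch below compares a file's size with the recommended limit and also estimates the length of the Base64 text that is actually transmitted, which is about one third larger than the file. The helper names are hypothetical, and treating 1 MB as 1024 x 1024 bytes and applying the limit to the file on disk are assumptions.

```python
import math
import os

MB = 1024 * 1024             # assumption: 1 MB counted as 1024 * 1024 bytes
SMALL_IMAGE_LIMIT = 1 * MB   # documents with little text
DENSE_IMAGE_LIMIT = 2 * MB   # dense A4-sized documents


def base64_length(num_bytes: int) -> int:
    """Length of the padded Base64 text for num_bytes of input."""
    return 4 * math.ceil(num_bytes / 3)


def within_recommended_size(path: str, dense: bool = False) -> bool:
    """Compare a file's size on disk with the recommended limit."""
    limit = DENSE_IMAGE_LIMIT if dense else SMALL_IMAGE_LIMIT
    return os.path.getsize(path) < limit
```

An image that fails this check is a candidate for compression before it is encoded and sent.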
+Refer to the following code for how to compress images:
+import cv2
+
+def resize_image(image, max_size):
+    """
+    Downsize an image proportionally. The long side of the image is compared with max_size. If the long side exceeds max_size, the image is downsized proportionally. Otherwise, the original image is returned.
+    :param image: image array, for example, as returned by cv2.imread.
+    :param max_size: maximum length of the long side of an image. (Set this parameter based on site requirements. You are advised to set this parameter to a value as small as possible as long as the resolution requirement is met.)
+    :return: returns the downsized image or the original image.
+    """
+    height, width = image.shape[:2]
+    max_side = max(height, width)
+    if max_side > max_size:
+        scale = max_size / max_side
+        image = cv2.resize(image, None, fx=scale, fy=scale)
+    return image
+
+image = cv2.imread('test.png')
+image = resize_image(image, max_size=1024)
+
# -*- coding: utf-8 -*-
+import io
+import os
+
+import fitz  # PyMuPDF
+from PIL import Image
+
+# Method names follow the snake_case PyMuPDF API (prerotate, get_pixmap, page_count, tobytes),
+# which replaced the camelCase names used in older releases.
+
+class CovertPdfToJpg:
+    def __init__(self, file_path, save_root):
+        self.file_path = file_path
+        self.save_root = save_root
+
+    @staticmethod
+    def open_pdf(file):
+        return fitz.open(file)
+
+    @staticmethod
+    def get_trans(doc, page, min_side=0, max_side=0, rotate=0.0):
+        """ Create a scale object. """
+        region = doc[page].rect
+        scale = 1
+        if max_side > min_side > 0:
+            # Scale the short side to min_side, but keep the long side within max_side.
+            scale = min_side / min(region.width, region.height)
+            if max(region.width, region.height) * scale > max_side:
+                scale = max_side / max(region.width, region.height)
+        trans = fitz.Matrix(scale, scale).prerotate(rotate)
+        return trans
+
+    def page2pix(self, doc, page, trans):
+        """ Render the given page as image data using the given matrix. """
+        return doc[page].get_pixmap(matrix=trans, alpha=False)
+
+    def pdf_to_jpg(self, width=1024, height=1400):
+        doc = self.open_pdf(self.file_path)
+        save_dir = os.path.join(self.save_root)
+        os.makedirs(save_dir, exist_ok=True)
+        print("document", len(doc), doc.page_count)
+        base_name = os.path.basename(self.file_path).replace('.pdf', '')
+        for i in range(len(doc)):
+            trans = self.get_trans(doc, i, width, height, rotate=0)
+            try:
+                pix = self.page2pix(doc, i, trans)
+            except Exception:
+                # Skip pages that cannot be rendered.
+                continue
+            image = Image.open(io.BytesIO(pix.tobytes("png")))
+            save_path = os.path.join(save_dir, base_name + '_' + str(i + 1) + '.jpg')
+            print(save_path)
+            image.save(save_path)
+        return
+
If the error message "Error: Header overflow" is displayed when calling an API using Postman to obtain an authentication token, it means that the header has exceeded its limit. To resolve this issue, follow these steps:
+Modify the environment variables of the operating system. In Windows 10, right-click This PC and choose Properties. Click Advanced system settings. On the Advanced tab, click Environment Variables.... In the Environment Variables dialog box, click New... in the System variables area. Configure Variable name and Variable value as follows:
+If the error message and error code "error_msg":"Orchestration error.","error_code":"APIG.0106" are returned when an API is called, check whether the frontend and backend parameters configured for the API are correct.
+This error is reported when the verification rule configured for frontend parameters of APIs is not met during API calling.
+Rectify the fault using either of the following methods:
+Use Postman to call the General Text OCR API.
+To call an OCR API, perform the following steps:
+Step 1: Subscribing to a Service
+Step 2: Configuring the Environment
+Step 3: Using a Token for Authentication
+ +Select a region based on service requirements. For details about the region where each service is deployed, see Regions and Endpoints.
+Tokens are used for identity authentication and permission management when calling an OCR API.
+Before calling an OCR API, first use the "Obtaining a Token" API to obtain a token. Then pass the token in a request header of the OCR API call so that the OCR service can authenticate the request and verify your identity.
+The token is valid for 24 hours.
+To obtain the token, perform the following steps:
+{
+    "auth": {
+        "identity": {
+            "methods": [
+                "password"
+            ],
+            "password": {
+                "user": {
+                    "name": "username", // IAM username
+                    "password": "********", // User password
+                    "domain": {
+                        "name": "domainname" // Domain name
+                    }
+                }
+            }
+        },
+        "scope": {
+            "project": {
+                "name": "eu-de"
+            }
+        }
+    }
+}
+
+
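The same request can be sketched in Python. `build_token_request_body` reproduces the body shown above. `request_token` rests on two assumptions that this section does not confirm: that `iam_url` is the "Obtaining a Token" endpoint taken from the API Reference, and that the token is returned in the `X-Subject-Token` response header. Both function names are hypothetical.

```python
import json
import urllib.request


def build_token_request_body(username: str, password: str,
                             domainname: str, project_name: str) -> dict:
    """Build the password-authentication body with a project-level scope."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "password": password,
                        "domain": {"name": domainname},
                    }
                },
            },
            "scope": {"project": {"name": project_name}},
        }
    }


def request_token(iam_url: str, body: dict) -> str:
    """POST the body and return the token.

    Assumptions: iam_url is the "Obtaining a Token" endpoint, and the token
    is returned in the X-Subject-Token response header.
    """
    request = urllib.request.Request(
        iam_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers["X-Subject-Token"]
```

Setting `scope` to a project, as in the body above, matters because OCR is a project-level service.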
Example: https://ocr.eu-de.otc.t-systems.com/v2/{project_id}/ocr/general-text
+{
+    "image":"/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA...",
+    "detect_direction":false,
+    "quick_mode":false
+}
+
+
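For comparison with the Postman steps, the request above can be assembled in Python as follows. This is a sketch under stated assumptions: the function name `build_general_text_request` is hypothetical, and passing the token in an `X-Auth-Token` header is an assumption that should be checked against the API Reference.

```python
import base64
import json
import urllib.request


def build_general_text_request(endpoint: str, project_id: str,
                               token: str, image_bytes: bytes) -> urllib.request.Request:
    """Assemble a POST request for the General Text OCR API without sending it."""
    url = f"{endpoint}/v2/{project_id}/ocr/general-text"
    body = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "detect_direction": False,
        "quick_mode": False,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumption: header used to pass the token
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request, and the response body can then be parsed with `json.loads`.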