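Overview: This article explains how Java developers can integrate the Baidu AI text recognition (OCR) service in two ways, through the SDK and through the REST API. It covers environment setup, the core code, error handling, and optimization strategies.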
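The Baidu AI text recognition service provides 12 types of interfaces, including general text recognition (OCR), high-accuracy OCR, and table recognition. Choose among them according to the business scenario.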
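A configuration loader that reads the credentials from environment variables:

    // Storing sensitive values in environment variables is recommended
    public class ConfigLoader {
        private static final String API_KEY = System.getenv("BAIDU_OCR_API_KEY");
        private static final String SECRET_KEY = System.getenv("BAIDU_OCR_SECRET_KEY");

        public static String getAccessToken() throws Exception {
            // OAuth 2.0 authentication logic: delegates to the AuthHelper class
            // shown in the REST API part of this article
            return AuthHelper.getAccessToken(API_KEY, SECRET_KEY);
        }
    }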
Store the API Key and Secret Key in environment variables or a configuration center rather than hard-coding them in the source.
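As a minimal sketch of that advice, the helper below reads a required setting and fails fast when it is missing, so a misconfigured deployment stops at startup rather than at the first OCR call. The class name `EnvConfig` and its methods are hypothetical and not part of the Baidu SDK; only the variable name `BAIDU_OCR_API_KEY` comes from this article.

```java
import java.util.function.Function;

/** Hypothetical helper (not part of the Baidu SDK) for reading required settings. */
final class EnvConfig {
    private EnvConfig() {}

    /** Returns the trimmed value for name, or throws if it is missing or blank. */
    static String require(Function<String, String> lookup, String name) {
        String value = lookup.apply(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value.trim();
    }

    /** Masks a secret for log output, keeping only its last four characters. */
    static String mask(String secret) {
        if (secret.length() <= 4) {
            return "****";
        }
        return "****" + secret.substring(secret.length() - 4);
    }
}
```

In production the lookup would be `System::getenv`, for example `EnvConfig.require(System::getenv, "BAIDU_OCR_API_KEY")`. Passing the lookup in keeps the helper testable with a plain map.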
Maven dependency configuration:

    <dependency>
        <groupId>com.baidu.aip</groupId>
        <artifactId>java-sdk</artifactId>
        <version>4.16.11</version>
    </dependency>
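Initializing the client and calling general text recognition:

    import com.baidu.aip.ocr.AipOcr;
    import org.json.JSONArray;
    import org.json.JSONObject;

    import java.util.HashMap;

    public class BaiduOCRClient {
        // Initialize the client; replace the placeholders with your own credentials
        public static final AipOcr client = new AipOcr("APP_ID", "API_KEY", "SECRET_KEY");

        static {
            // Optional: network connection parameters
            client.setConnectionTimeoutInMillis(2000);
            client.setSocketTimeoutInMillis(60000);
        }

        public static String recognizeText(String imagePath) {
            // Call the general text recognition interface
            JSONObject res = client.basicGeneral(imagePath, new HashMap<>());
            return parseResult(res);
        }

        private static String parseResult(JSONObject res) {
            // Parse the result; an error response carries no words_result array
            JSONArray wordsResult = res.optJSONArray("words_result");
            if (wordsResult == null) {
                throw new IllegalStateException("OCR request failed: " + res);
            }
            StringBuilder sb = new StringBuilder();
            // org.json's JSONArray exposes length(), not size()
            for (int i = 0; i < wordsResult.length(); i++) {
                sb.append(wordsResult.getJSONObject(i).getString("words")).append("\n");
            }
            return sb.toString();
        }
    }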
- `basicGeneralBatch` method
- `options` parameter: specifies `language_type`
- `location` field: returns the text coordinates
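Obtaining an access token through the REST API:

    import java.net.URI;
    import java.net.URLEncoder;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;

    import org.json.JSONObject;

    public class AuthHelper {
        public static String getAccessToken(String apiKey, String secretKey) throws Exception {
            // Percent-encode the credentials so special characters cannot break the query string
            String url = "https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials"
                    + "&client_id=" + URLEncoder.encode(apiKey, StandardCharsets.UTF_8)
                    + "&client_secret=" + URLEncoder.encode(secretKey, StandardCharsets.UTF_8);
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder().uri(URI.create(url)).GET().build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            JSONObject json = new JSONObject(response.body());
            // A failed request returns an error description instead of access_token
            if (!json.has("access_token")) {
                throw new IllegalStateException("Token request failed: " + response.body());
            }
            return json.getString("access_token");
        }
    }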
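Each call to `AuthHelper.getAccessToken` makes a network round trip, yet a token can be reused until it expires. The sketch below caches a token for a caller-supplied lifetime. `CachedToken` is a hypothetical class, and the lifetime is an assumption: it should be taken from whatever expiry the token response reports, not hard-coded.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical cache that re-fetches a token only after its lifetime has elapsed. */
final class CachedToken {
    private final Supplier<String> fetcher;   // e.g. a wrapper around AuthHelper.getAccessToken
    private final Duration lifetime;          // assumed validity period of one token
    private final Supplier<Instant> clock;    // injectable so expiry can be tested
    private String token;
    private Instant expiresAt;

    CachedToken(Supplier<String> fetcher, Duration lifetime, Supplier<Instant> clock) {
        this.fetcher = fetcher;
        this.lifetime = lifetime;
        this.clock = clock;
    }

    /** Returns the cached token, fetching a new one on first use or after expiry. */
    synchronized String get() {
        Instant now = clock.get();
        if (token == null || !now.isBefore(expiresAt)) {
            token = fetcher.get();
            expiresAt = now.plus(lifetime);
        }
        return token;
    }
}
```

Choosing a lifetime somewhat shorter than the real validity period leaves a safety margin, so a token is never used right at the moment it expires.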
Calling the general OCR endpoint directly:

    import java.net.URI;
    import java.net.URLEncoder;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Base64;

    public class OCRDirectCaller {
        public static String callOCR(String accessToken, Path imagePath) throws Exception {
            String imageBase64 = Base64.getEncoder().encodeToString(Files.readAllBytes(imagePath));
            String url = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic?access_token=" + accessToken;
            // The body must match the declared form content type: key=value pairs with
            // percent-encoded values (Base64 output can contain '+', '/' and '=')
            String requestBody = "image=" + URLEncoder.encode(imageBase64, StandardCharsets.UTF_8)
                    + "&language_type=CHN_ENG";
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(url))
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .POST(HttpRequest.BodyPublishers.ofString(requestBody))
                    .build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.body();
        }
    }
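A request declared as `application/x-www-form-urlencoded` carries `key=value` pairs joined by `&`, with every key and value percent-encoded. This matters for the `image` field because Base64 text can contain `+`, `/`, and `=`. The following is a small sketch of such an encoder; `FormBody` is a hypothetical helper name, and only standard-library calls are used.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds an application/x-www-form-urlencoded body. */
final class FormBody {
    private FormBody() {}

    /** Encodes the fields in iteration order as key=value pairs joined by '&'. */
    static String encode(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            joiner.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }
}
```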
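| Error code | Meaning | Solution |
|---|---|---|
| 110 | Access frequency limited | Implement retry with exponential backoff |
| 111 | Invalid credentials | Check the key's validity period and permission settings |
| 120 | Image processing failed | Verify the image format (JPG/PNG/BMP are supported) |
| 121 | Image size exceeds the limit | Reduce the image size (under 4 MB recommended) |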
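
The exponential backoff suggested in the table can be sketched as follows. This is one possible shape, not the SDK's own mechanism: `RetryWithBackoff` is a hypothetical class, the caller decides through a predicate which results are worth retrying, and the delay doubles after each failed attempt up to a cap.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical retry helper: doubles the wait after every retryable result. */
final class RetryWithBackoff {
    private RetryWithBackoff() {}

    /** Delay before retry number attempt (0-based): base * 2^attempt, capped at maxMillis. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /** Runs task up to maxAttempts times, sleeping between attempts while shouldRetry holds. */
    static <T> T call(Supplier<T> task, Predicate<T> shouldRetry,
                      int maxAttempts, long baseMillis, long maxMillis) {
        T result = task.get();
        for (int attempt = 0; attempt + 1 < maxAttempts && shouldRetry.test(result); attempt++) {
            try {
                Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting to retry", e);
            }
            result = task.get();
        }
        return result;
    }
}
```

For example, `RetryWithBackoff.call(() -> callOnce(), r -> isRateLimited(r), 4, 500, 8000)` would wait 500 ms, 1 s, and 2 s between its four attempts. Here `callOnce` and `isRateLimited` are placeholders for the caller's own request and error-code check.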
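
Asynchronous processing with `CompletableFuture`:

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.CompletionException;

    public class AsyncOCRProcessor {
        public static CompletableFuture<String> processAsync(String imagePath) {
            // No executor is supplied, so the task runs on the common ForkJoinPool
            return CompletableFuture.supplyAsync(() -> {
                try {
                    return BaiduOCRClient.recognizeText(imagePath);
                } catch (Exception e) {
                    // Surface the failure to whoever joins the future
                    throw new CompletionException(e);
                }
            });
        }
    }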
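When many images are processed at once, a dedicated fixed-size pool keeps blocking OCR calls off the shared common pool and bounds concurrency. The sketch below shows one way to do this. `BatchOcr` is a hypothetical class, and the recognizer is passed in as a function so that anything with the shape of `BaiduOCRClient::recognizeText` can be plugged in.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical batch runner: recognizes several images with bounded parallelism. */
final class BatchOcr {
    private BatchOcr() {}

    /** Returns one result per input path, in input order. */
    static List<String> recognizeAll(List<String> imagePaths,
                                     Function<String, String> recognizer,
                                     int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            // Submit every image first so that the calls overlap
            List<CompletableFuture<String>> futures = imagePaths.stream()
                    .map(path -> CompletableFuture.supplyAsync(() -> recognizer.apply(path), pool))
                    .collect(Collectors.toList());
            // join() rethrows a task failure wrapped in CompletionException
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }
}
```

The parallelism value should stay at or below the request rate the account allows, otherwise the extra threads only produce rate-limit errors.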
Request logging with SLF4J:

    import org.slf4j.*;

    public class OCRLogger {
        private static final Logger logger = LoggerFactory.getLogger(OCRLogger.class);

        public static void logRequest(String requestId, long startTime) {
            logger.info("OCR Request [{}] started at {}", requestId, startTime);
        }

        public static void logResponse(String requestId, String result, long duration) {
            logger.info("OCR Request [{}] completed in {}ms. Result length: {}",
                    requestId, duration, result.length());
        }
    }
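Client-side rate limiting with Guava's `RateLimiter`:

    import com.google.common.util.concurrent.RateLimiter;

    public class RateLimiterWrapper {
        private static final RateLimiter limiter = RateLimiter.create(5.0); // 5 permits per second

        public static void acquire() {
            // tryAcquire() does not block: fail fast when no permit is available
            if (!limiter.tryAcquire()) {
                throw new RuntimeException("Rate limit exceeded");
            }
        }
    }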
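For projects that do not include Guava, the same fail-fast behaviour can be approximated with the standard library. The sketch below is a simple token bucket, which is a different and much simpler algorithm than Guava's `RateLimiter`. `SimpleTokenBucket` is a hypothetical class, and the one-second burst capacity is an assumption chosen for illustration.

```java
import java.util.function.LongSupplier;

/** Hypothetical stdlib-only token bucket; not equivalent to Guava's RateLimiter. */
final class SimpleTokenBucket {
    private static final double NANOS_PER_SECOND = 1_000_000_000.0;

    private final double permitsPerSecond;
    private final double capacity;          // assumed burst size: one second's worth of permits
    private final LongSupplier nanoClock;   // System::nanoTime in production
    private double available;
    private long lastRefillNanos;

    SimpleTokenBucket(double permitsPerSecond, LongSupplier nanoClock) {
        this.permitsPerSecond = permitsPerSecond;
        this.capacity = permitsPerSecond;
        this.nanoClock = nanoClock;
        this.available = capacity;
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one permit if available; never blocks. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / NANOS_PER_SECOND;
        // Refill in proportion to elapsed time, never beyond capacity
        available = Math.min(capacity, available + elapsedSeconds * permitsPerSecond);
        lastRefillNanos = now;
        if (available >= 1.0) {
            available -= 1.0;
            return true;
        }
        return false;
    }
}
```

A production instance would be created as `new SimpleTokenBucket(5.0, System::nanoTime)`. The injectable clock exists only so that refill behaviour can be tested without sleeping.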
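The `recog_grand` parameter can be used to improve handwriting recognition.

    // Set custom vocabulary in options
    // (HashMap rather than Map, because the SDK methods take HashMap<String, String>)
    HashMap<String, String> options = new HashMap<>();
    options.put("recognized_grand", "1");   // enable precise mode
    options.put("word_sim_enable", "1");    // enable similar-character detection
    options.put("word_replace_type", "custom_words");
    options.put("custom_words", "百度,AI,OCR");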
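Mixed recognition of 20 languages, including Chinese, English, Japanese, and Korean, is supported. The language is selected with the `language_type` parameter: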
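- `CHN_ENG`: mixed Chinese and English
- `JAP`: Japanese
- `KOR`: Korean

Q1: How can recognition accuracy be improved against complex backgrounds?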
A: Preprocess the image before submitting it for recognition.
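Q2: What is the daily call limit?
A: The default free quota is 500 calls per day; beyond that, a package must be purchased. Enterprise users can apply for a higher quota.
Q3: Which image formats are supported?
A: JPG, PNG, and BMP are supported, and a single image must not exceed 4 MB.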
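These limits can be checked locally before an image is uploaded, which saves a round trip for files that would be rejected anyway. The sketch below is a hypothetical pre-check under three assumptions: `ImageValidator` is an illustrative name, 4 MB is read as 4 × 1024 × 1024 bytes of raw file size, and the format is judged by file extension (with `jpeg` treated as an alias of `jpg`), not by inspecting the file content.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical pre-upload check based on file extension and raw size. */
final class ImageValidator {
    /** Assumed interpretation of the 4 MB limit. */
    static final long MAX_BYTES = 4L * 1024 * 1024;

    private static final Set<String> EXTENSIONS = Set.of("jpg", "jpeg", "png", "bmp");

    private ImageValidator() {}

    /** Returns true when the extension is supported and the size is within the limit. */
    static boolean isAcceptable(String fileName, long sizeBytes) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;   // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(extension) && sizeBytes > 0 && sizeBytes <= MAX_BYTES;
    }
}
```

A typical call would be `ImageValidator.isAcceptable(path.getFileName().toString(), Files.size(path))` for a `java.nio.file.Path` named `path`.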
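With a systematic implementation and the optimization strategies above, Java developers can integrate the Baidu AI text recognition service efficiently and reliably. A sensible path is to start with the SDK integration, then learn the REST API calls, and finally build a customized OCR solution around the needs of the business.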