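Summary: This article explains how to integrate the Baidu OCR service into a Spring Boot project to recognize ID cards automatically and extract their fields, covering environment setup, API calls, result parsing, and exception handling.
In digital office workflows, entering ID card information is a frequent task. Manual entry is slow and error-prone, and automating it with OCR (optical character recognition) improves efficiency considerably. Baidu OCR's ID card recognition API supports both the front and back of the card and extracts the key fields, and combined with Spring Boot it allows an enterprise-grade application to be built quickly.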
<!-- pom.xml dependencies -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>com.baidu.aip</groupId>
    <artifactId>java-sdk</artifactId>
    <version>4.16.11</version>
</dependency>
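# application.yml
baidu:
  ocr:
    app-id: your_app_id
    api-key: your_api_key
    secret-key: your_secret_key
    access-token: # obtained at runtime; the SDK client requests and refreshes it

// BaiduOCRService.java
import com.baidu.aip.ocr.AipOcr;
import javax.annotation.PostConstruct;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

@Service
public class BaiduOCRService {

    @Value("${baidu.ocr.app-id}")
    private String appId;

    @Value("${baidu.ocr.api-key}")
    private String apiKey;

    @Value("${baidu.ocr.secret-key}")
    private String secretKey;

    // One shared client: the SDK fetches and renews the access token itself
    private AipOcr client;

    @PostConstruct
    public void init() {
        // The SDK constructor takes (appId, apiKey, secretKey)
        client = new AipOcr(appId, apiKey, secretKey);
        client.setConnectionTimeoutInMillis(2000);
        client.setSocketTimeoutInMillis(60000);
    }

    public AipOcr getClient() { return client; }

    public String getApiKey() { return apiKey; }

    public String getSecretKey() { return secretKey; }
}

// OCRController.java
import java.util.HashMap;
import org.json.JSONObject;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/ocr")
public class OCRController {

    @Autowired
    private BaiduOCRService ocrService;

    @PostMapping("/idcard")
    public ResponseEntity<?> recognizeIdCard(@RequestParam("file") MultipartFile file,
                                             @RequestParam("side") String side) {
        try {
            // 1. Validate the upload and the side parameter
            if (file.isEmpty()) {
                return ResponseEntity.badRequest().body("The file must not be empty");
            }
            if (!"front".equals(side) && !"back".equals(side)) {
                return ResponseEntity.badRequest().body("side must be front or back");
            }

            // 2. Call the OCR API
            HashMap<String, String> options = new HashMap<>();
            options.put("detect_direction", "true");
            JSONObject res = ocrService.getClient().idcard(file.getBytes(), side, options);

            // 3. Parse the result; error_code is treated as absent on success
            if (res.optInt("error_code", 0) != 0) {
                return ResponseEntity.status(500)
                        .body("Recognition failed: " + res.optString("error_msg"));
            }
            // Return the raw JSON text so Jackson does not try to serialize JSONObject
            return ResponseEntity.ok()
                    .contentType(MediaType.APPLICATION_JSON)
                    .body(res.getJSONObject("words_result").toString());
        } catch (Exception e) {
            return ResponseEntity.status(500).body("System error: " + e.getMessage());
        }
    }
}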
The side parameter: front (front of the card) or back (back of the card)
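Since only two values are accepted for side, it can help to validate the parameter before spending an API call on it. The following is a minimal sketch in plain Java; the IdCardSide enum and its parse method are hypothetical names introduced here for illustration and are not part of the Baidu SDK.

```java
import java.util.Locale;
import java.util.Optional;

class IdCardSideDemo {

    /** The two values the article lists for the side request parameter. */
    enum IdCardSide {
        FRONT("front"),
        BACK("back");

        private final String apiValue;

        IdCardSide(String apiValue) {
            this.apiValue = apiValue;
        }

        /** The string to pass to the OCR call. */
        String apiValue() {
            return apiValue;
        }

        /** Case-insensitive, whitespace-tolerant parse; empty when the value is not recognized. */
        static Optional<IdCardSide> parse(String raw) {
            if (raw == null) {
                return Optional.empty();
            }
            String normalized = raw.trim().toLowerCase(Locale.ROOT);
            for (IdCardSide side : values()) {
                if (side.apiValue.equals(normalized)) {
                    return Optional.of(side);
                }
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(IdCardSide.parse(" Front ").map(IdCardSide::apiValue).orElse("invalid"));
        System.out.println(IdCardSide.parse("left").map(IdCardSide::apiValue).orElse("invalid"));
    }
}
```

In a controller, an empty Optional would typically be turned into a 400 response rather than forwarded to the OCR service.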
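public List<IdCardInfo> batchRecognize(List<MultipartFile> files) {
    // A bounded pool keeps at most 5 OCR requests in flight
    ExecutorService executor = Executors.newFixedThreadPool(5);
    try {
        // Submit every file first so the requests run concurrently
        List<CompletableFuture<IdCardInfo>> futures = files.stream()
                // recognizeSingle(file) is the single-file recognition method
                .map(file -> CompletableFuture.supplyAsync(() -> recognizeSingle(file), executor))
                .collect(Collectors.toList());
        // join() waits for each result and rethrows a failure as CompletionException
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    } finally {
        // Release the worker threads; the original never shut the pool down
        executor.shutdown();
    }
}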
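The IdCardInfo type returned above is not defined in the article. One possible shape is sketched below, under two assumptions: the words_result object has already been flattened into a map from field label to recognized text, and the front side uses the Chinese labels 姓名 (name), 性别 (gender), 民族 (ethnicity), 出生 (date of birth), 住址 (address), and 公民身份号码 (ID number). These labels should be checked against the actual response. They are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical value object for the front side of an ID card. */
class IdCardInfo {
    // Assumed field labels in words_result (see the lead-in for the Chinese text)
    static final String KEY_NAME = "\u59d3\u540d";                         // name
    static final String KEY_GENDER = "\u6027\u522b";                       // gender
    static final String KEY_ETHNICITY = "\u6c11\u65cf";                    // ethnicity
    static final String KEY_BIRTH = "\u51fa\u751f";                        // date of birth
    static final String KEY_ADDRESS = "\u4f4f\u5740";                      // address
    static final String KEY_ID_NUMBER = "\u516c\u6c11\u8eab\u4efd\u53f7\u7801"; // ID number

    private final String name;
    private final String gender;
    private final String ethnicity;
    private final String birthDate;
    private final String address;
    private final String idNumber;

    private IdCardInfo(String name, String gender, String ethnicity,
                       String birthDate, String address, String idNumber) {
        this.name = name;
        this.gender = gender;
        this.ethnicity = ethnicity;
        this.birthDate = birthDate;
        this.address = address;
        this.idNumber = idNumber;
    }

    /** Builds the object from a label-to-text map; missing fields become empty strings. */
    static IdCardInfo fromWords(Map<String, String> words) {
        return new IdCardInfo(
                words.getOrDefault(KEY_NAME, ""),
                words.getOrDefault(KEY_GENDER, ""),
                words.getOrDefault(KEY_ETHNICITY, ""),
                words.getOrDefault(KEY_BIRTH, ""),
                words.getOrDefault(KEY_ADDRESS, ""),
                words.getOrDefault(KEY_ID_NUMBER, ""));
    }

    String getName() { return name; }
    String getGender() { return gender; }
    String getEthnicity() { return ethnicity; }
    String getBirthDate() { return birthDate; }
    String getAddress() { return address; }
    String getIdNumber() { return idNumber; }

    public static void main(String[] args) {
        Map<String, String> words = new HashMap<>();
        words.put(KEY_BIRTH, "19900307");
        words.put(KEY_ID_NUMBER, "110101199003071234");
        IdCardInfo info = fromWords(words);
        System.out.println(info.getBirthDate() + " " + info.getIdNumber());
    }
}
```

Defaulting missing fields to empty strings keeps a partially recognized card from raising an exception; callers that need every field can check for empty values explicitly.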
Preprocessing suggestions:
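public BufferedImage preprocessImage(BufferedImage image) {
    int width = image.getWidth();
    int height = image.getHeight();

    // 1. Binarize: threshold each pixel's luminance at 127
    //    (a RescaleOp offset only brightens the image, it does not binarize it)
    BufferedImage processed = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int rgb = image.getRGB(x, y);
            int r = (rgb >> 16) & 0xFF;
            int g = (rgb >> 8) & 0xFF;
            int b = rgb & 0xFF;
            int luminance = (int) (0.299 * r + 0.587 * g + 0.114 * b);
            processed.getRaster().setSample(x, y, 0, luminance > 127 ? 255 : 0);
        }
    }

    // 2. Rotation correction (example); needRotation is a helper the article leaves undefined
    if (needRotation(processed)) {
        AffineTransform rotate = AffineTransform.getRotateInstance(
                Math.toRadians(90), width / 2.0, height / 2.0);
        // Rotating about the center can clip non-square images
        processed = new AffineTransformOp(rotate, AffineTransformOp.TYPE_BILINEAR)
                .filter(processed, null);
    }
    return processed;
}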
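Alongside preprocessing, it is usually worth rejecting images the service would refuse anyway. The sketch below assumes the limits commonly quoted for Baidu OCR (Base64-encoded size up to 4 MB, shortest side at least 15 px, longest side at most 4096 px); these numbers are an assumption here and should be checked against the current API documentation. ImageLimits and its methods are hypothetical names.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

class ImageLimits {
    // Assumed service limits; verify against the API documentation
    static final int MIN_SIDE_PX = 15;
    static final int MAX_SIDE_PX = 4096;
    static final long MAX_BASE64_BYTES = 4L * 1024 * 1024;

    /** Returns null when the image looks acceptable, otherwise a short reason. */
    static String check(byte[] imageBytes) {
        // Base64 expands every 3 input bytes to 4 output characters
        long base64Length = 4L * ((imageBytes.length + 2) / 3);
        if (base64Length > MAX_BASE64_BYTES) {
            return "image too large";
        }
        BufferedImage image;
        try {
            image = ImageIO.read(new ByteArrayInputStream(imageBytes));
        } catch (IOException e) {
            return "unreadable image";
        }
        if (image == null) {
            // ImageIO.read returns null when no registered reader recognizes the data
            return "unsupported image format";
        }
        if (Math.min(image.getWidth(), image.getHeight()) < MIN_SIDE_PX) {
            return "image too small";
        }
        if (Math.max(image.getWidth(), image.getHeight()) > MAX_SIDE_PX) {
            return "image dimensions too large";
        }
        return null;
    }

    /** Test helper: encodes a blank PNG of the given size. */
    static byte[] blankPng(int width, int height) {
        try {
            BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(image, "png", out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(check(blankPng(100, 60)));
        System.out.println(check(blankPng(10, 60)));
    }
}
```

A non-null result can be returned to the caller as a 400 response before any OCR quota is consumed.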
@ControllerAdvice
public class OCRExceptionHandler {

    // AipException is the exception type named in the original article;
    // adjust it to whatever your SDK version actually throws.
    @ExceptionHandler(AipException.class)
    public ResponseEntity<?> handleAipException(AipException e) {
        Map<String, Object> body = new HashMap<>();
        body.put("error_code", e.getErrorCode());
        body.put("message", e.getMessage());
        body.put("request_id", e.getRequestId());
        // 429 assumes the failure is a quota or QPS limit; other errors deserve other statuses
        return ResponseEntity.status(429).body(body);
    }

    @ExceptionHandler(IOException.class)
    public ResponseEntity<?> handleIOException(IOException e) {
        return ResponseEntity.status(500).body("File processing failed: " + e.getMessage());
    }
}
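Returning 429 for every OCR failure hides the difference between rate limiting and a bad upload. A possible refinement is to map the service's error_code to an HTTP status. The sketch below assumes that codes 17, 18, and 19 indicate request or QPS limits and that 216200 to 216202 indicate an empty, badly formatted, or wrongly sized image; this assignment is an assumption and should be confirmed against Baidu's error-code table. OcrErrorMapper is a hypothetical helper.

```java
class OcrErrorMapper {

    /** Maps an OCR error_code to the HTTP status the API should return to its caller. */
    static int toHttpStatus(int errorCode) {
        switch (errorCode) {
            case 0:
                return 200; // no error
            case 17:
            case 18:
            case 19:
                return 429; // assumed: daily, QPS, or total request limit reached
            case 216200:
            case 216201:
            case 216202:
                return 400; // assumed: empty image, bad format, bad size
            default:
                return 500; // anything else is treated as a server-side failure
        }
    }

    public static void main(String[] args) {
        System.out.println(toHttpStatus(18));
        System.out.println(toHttpStatus(216201));
        System.out.println(toHttpStatus(282000));
    }
}
```

Keeping the mapping in one place makes it easy to extend as further codes turn up in production logs.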
Connection pool management:
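// Pooled Apache HttpClient (4.x) exposed as a bean.
// Assumption: the SDK client is not known to offer a hook for replacing its HTTP client,
// so the pool is a standalone bean for code that calls the OCR REST endpoint directly.
// For SDK calls, reuse a single AipOcr instance instead of creating one per request.
@Bean
public CloseableHttpClient ocrHttpClient() {
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setMaxTotal(20);          // at most 20 connections in total
    cm.setDefaultMaxPerRoute(5); // at most 5 connections per target host
    return HttpClients.custom()
            .setConnectionManager(cm)
            .build();
}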
Caching strategy:
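The article does not spell out its caching approach. One common option, sketched here purely as an illustration, is to key results by a hash of the image bytes so that re-uploading the same file does not trigger a second OCR call. OcrResultCache and its methods are hypothetical, and the recognizer function stands in for the real OCR call.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class OcrResultCache {
    // SHA-256 of the image bytes -> recognition result (kept as a JSON string here)
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** Returns the cached result for identical image bytes, calling the recognizer only on a miss. */
    String recognize(byte[] imageBytes, Function<byte[], String> recognizer) {
        return cache.computeIfAbsent(sha256Hex(imageBytes), key -> recognizer.apply(imageBytes));
    }

    /** Lower-case hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        OcrResultCache cache = new OcrResultCache();
        byte[] image = {1, 2, 3};
        System.out.println(cache.recognize(image, bytes -> "recognized"));
        System.out.println(cache.recognize(image, bytes -> "never called"));
    }
}
```

This in-memory map grows without bound and holds personal data, so a production version would likely need an expiry time and a size limit, or a store such as Redis with a TTL.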
Docker deployment:
FROM openjdk:11-jre-slim
COPY target/ocr-service.jar /app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app.jar"]
Monitoring metrics:
detect_direction=true)
@Configuration
public class WebConfig implements WebMvcConfigurer {

    @Override
    public void addCorsMappings(CorsRegistry registry) {
        registry.addMapping("/**")
                .allowedOrigins("*")
                .allowedMethods("GET", "POST", "PUT", "DELETE")
                .allowedHeaders("*");
    }
}
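Integrating Baidu OCR with Spring Boot for ID card recognition makes it possible to build an efficient and accurate automated information-capture system. In a real deployment, pay particular attention to:
Possible future directions:
The complete project source is available in the sample GitHub repository; it is best to start from the basic version and iterate from there.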