Recommended reading

Installing packages

# Create a dedicated environment (a good habit)
conda create -n textract python=3.12
conda activate textract

# Install boto3 so we can call AWS services
pip install boto3

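Local permission setup

According to the official documentation, set up the following first:

  1. Create an IAM user with these four permissions:
    • AmazonTextractFullAccess: allows calling every Textract API
    • AmazonS3FullAccess: the files to analyze are stored in S3, so the user needs access to them
    • AmazonSQSFullAccess and AmazonSNSFullAccess: required for asynchronous detection, where SNS delivers the job-completion status to SQS
  2. Create an access key
  3. Put the key in ~/.aws/credentials
    [default]
    aws_access_key_id = YOUR_ACCESS_KEY
    aws_secret_access_key = YOUR_SECRET_KEY
  4. Set the region in ~/.aws/config
    [default]
    region=us-east-1

With this in place, boto3.client('textract') uses the configured key and region.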
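Both files use an INI-style layout, so a quick sanity check of what a profile contains can be sketched with the standard library. This is only an illustration of the file format, not how boto3 reads the files internally; the helper name read_profile and the sample strings are assumptions made for this example.

```python
import configparser

# Sample contents mirroring the two files shown above.
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
"""

SAMPLE_CONFIG = """\
[default]
region=us-east-1
"""


def read_profile(ini_text, profile="default"):
    """Return the key/value pairs of one profile section, or {} if it is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(profile):
        return {}
    return dict(parser[profile])


print(read_profile(SAMPLE_CONFIG)["region"])
```

To check real files, the same function could be fed the text of ~/.aws/config and ~/.aws/credentials.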

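Textract asynchronous processing


Detecting or Analyzing Text in a Multipage Document - Amazon Textract shows the overall flow. As its diagram shows, when the client application calls Textract's StartDocumentTextDetection method:

  1. Textract receives the analysis request and locates the file in S3 from the request parameters.
  2. Analysis starts, and the call returns a JobId that the client later uses to look for that job's completion status in SQS.
  3. When the job finishes, the completion status is published to SNS, which delivers it to the SQS queue.
  4. The client application keeps polling SQS for the JobId returned in step 2. When it appears, the job is finished and the result can be fetched with GetDocumentTextDetection.
  5. Once GetDocumentTextDetection has returned the result, any follow-up processing can run.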
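Step 4 comes down to unwrapping each SQS message and comparing its JobId with the one returned in step 2. The sketch below shows that unwrapping on its own, assuming the queue is subscribed to the SNS topic without raw message delivery, so the SQS body is an SNS envelope whose Message field holds Textract's JSON. The function names and the job ID "job-123" are made up for illustration.

```python
import json


def parse_completion_message(sqs_body):
    """Return (job_id, status) from the body of one SQS message."""
    notification = json.loads(sqs_body)            # outer SNS envelope
    payload = json.loads(notification["Message"])  # inner Textract notification
    return payload["JobId"], payload["Status"]


def is_job_done(sqs_body, expected_job_id):
    """True when the message reports success for the job we started."""
    job_id, status = parse_completion_message(sqs_body)
    return job_id == expected_job_id and status == "SUCCEEDED"


# A hand-built message body with the same nesting as a real notification.
sample_body = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({"JobId": "job-123", "Status": "SUCCEEDED"}),
})
print(parse_completion_message(sample_body))
```

Checking the status as well as the JobId keeps a failed job from being treated as one whose results are ready.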

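Step 2: Get the RoleArn

When an asynchronous operation completes, Amazon Textract needs permission to publish a message to your Amazon SNS topic. You grant it through an IAM service role that lets Amazon Textract access the topic. When you create the Amazon SNS topic, its name must start with AmazonTextract, for example AmazonTextractMyTopicName.

  1. Sign in to the IAM console (https://console.aws.amazon.com/iam).
  2. In the navigation pane, choose Roles.
  3. Choose Create role.
  4. For Select type of trusted entity, choose AWS service.
  5. For Choose the service that will use this role, choose Textract.
  6. Choose Next: Permissions.
  7. Verify that the AmazonTextractServiceRole policy is in the list of attached policies. To show the policy in the list, enter part of its name in Filter policies.
  8. Choose Next: Tags.
  9. You don't need to add tags, so choose Next: Review.
  10. In the Review section, enter a name for the role in Role name (for example, TextractRole). In Role description, update the description, then choose Create role.
  11. Choose the new role to open its details page.
  12. In Summary, copy the Role ARN value and save it.
  13. Choose Trust relationships.
  14. Choose Edit trust relationship and make sure the trust policy looks like the one below. To prevent the confused deputy problem, make sure the trust policy includes conditions that limit the scope of the permission. For details on this potential security issue, see Cross-service confused deputy prevention. In the example below, replace 123456789012 with your AWS account ID.
  15. Choose Update Trust Policy.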
{
  "Version": "2012-10-17",
  "Statement": {
    "Sid": "ConfusedDeputyPreventionExamplePolicy",
    "Effect": "Allow",
    "Principal": {
      "Service": "textract.amazonaws.com"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "ArnLike": {
        "aws:SourceArn": "arn:aws:textract:*:123456789012:*"
      },
      "StringEquals": {
        "aws:SourceAccount": "123456789012"
      }
    }
  }
}

The result looks like this:


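Step 3: SQS and SNS

Give the Amazon SNS topic permission to send messages to the Amazon SQS queue. For the topic to deliver messages to the queue, the queue needs a policy that allows the topic to perform the sqs:SendMessage action. Create the topic and the queue before you subscribe the queue to the topic; if you have not created them yet, do so now. For details, see Creating a topic, and Creating a queue in the Amazon Simple Queue Service Developer Guide.

Setting the queue's SendMessage policy in the Amazon SQS console

  1. Sign in to the AWS Management Console and open the Amazon SQS console at https://console.aws.amazon.com/sqs/.
  2. Select the check box for the queue whose policy you want to set, choose the Access policy tab, then choose Edit.
  3. In the Access policy section, define who can access your queue.
    • Add a condition that allows the action for the topic.
    • Set Principal to the Amazon SNS service, as in the example below.
    • Use the aws:SourceArn or aws:SourceAccount global condition key to guard against the confused deputy scenario. To use these condition keys, set the value to the ARN of your topic. If your queue is subscribed to multiple topics, you can use aws:SourceAccount instead.

For example, the following policy allows MyTopic to send messages to MyQueue. Replace 123456789012 with your account ID.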

{
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "sns.amazonaws.com"
      },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:us-east-2:123456789012:MyQueue",
      "Condition": {
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:sns:us-east-2:123456789012:MyTopic"
        }
      }
    }
  ]
}
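When the queue and topic are created from code rather than the console, the same policy can be built from the two ARNs. The following is a minimal sketch that mirrors the JSON above field for field. The function name build_queue_policy is an assumption for this example, and the ARNs are the placeholder values from the policy.

```python
import json


def build_queue_policy(queue_arn, topic_arn):
    """Return a JSON policy string that lets one SNS topic send to one SQS queue."""
    policy = {
        "Statement": [
            {
                "Effect": "Allow",
                # Only the SNS service may act as the sender.
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Restrict the grant to messages coming from this topic.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ]
    }
    return json.dumps(policy, indent=2)


policy_json = build_queue_policy(
    "arn:aws:sqs:us-east-2:123456789012:MyQueue",
    "arn:aws:sns:us-east-2:123456789012:MyTopic",
)
print(policy_json)
```

Building the dictionary first and serializing it with json.dumps avoids escaping braces by hand in a format string. The resulting string is the kind of value the queue's Policy attribute takes.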

Python code

See Detecting or Analyzing Text in a Multipage Document - Amazon Textract, which includes complete Python code and explains in detail how to implement the whole flow described above in Python.

import boto3
import json
import sys
import time


class ProcessType:
    DETECTION = 1
    ANALYSIS = 2


class DocumentProcessor:
    jobId = ''
    region_name = ''

    roleArn = ''
    bucket = ''
    document = ''

    sqsQueueUrl = ''
    snsTopicArn = ''
    processType = ''

    def __init__(self, role, bucket, document, region):
        self.roleArn = role
        self.bucket = bucket
        self.document = document
        self.region_name = region

        # Create all clients in the same region so the topic, the queue
        # and the Textract job can see each other
        self.textract = boto3.client('textract', region_name=self.region_name)
        self.sqs = boto3.client('sqs', region_name=self.region_name)
        self.sns = boto3.client('sns', region_name=self.region_name)

    def ProcessDocument(self, type):
        jobFound = False

        self.processType = type
        validType = False

        # Determine which type of processing to perform
        if self.processType == ProcessType.DETECTION:
            response = self.textract.start_document_text_detection(
                DocumentLocation={'S3Object': {'Bucket': self.bucket, 'Name': self.document}},
                NotificationChannel={'RoleArn': self.roleArn, 'SNSTopicArn': self.snsTopicArn})
            print('Processing type: Detection')
            validType = True

        if self.processType == ProcessType.ANALYSIS:
            response = self.textract.start_document_analysis(
                DocumentLocation={'S3Object': {'Bucket': self.bucket, 'Name': self.document}},
                FeatureTypes=["TABLES", "FORMS"],
                NotificationChannel={'RoleArn': self.roleArn, 'SNSTopicArn': self.snsTopicArn})
            print('Processing type: Analysis')
            validType = True

        if not validType:
            print("Invalid processing type. Choose Detection or Analysis.")
            return

        print('Start Job Id: ' + response['JobId'])
        dotLine = 0
        # Poll the queue until the completion message for our job arrives
        while not jobFound:
            sqsResponse = self.sqs.receive_message(QueueUrl=self.sqsQueueUrl,
                                                   MessageAttributeNames=['All'],
                                                   MaxNumberOfMessages=10)

            if sqsResponse:

                if 'Messages' not in sqsResponse:
                    # Nothing yet: print a progress dot and wait
                    if dotLine < 40:
                        print('.', end='')
                        dotLine = dotLine + 1
                    else:
                        print()
                        dotLine = 0
                    sys.stdout.flush()
                    time.sleep(5)
                    continue

                for message in sqsResponse['Messages']:
                    # The SQS body is an SNS envelope; Textract's JSON is in 'Message'
                    notification = json.loads(message['Body'])
                    textMessage = json.loads(notification['Message'])
                    print(textMessage['JobId'])
                    print(textMessage['Status'])
                    if str(textMessage['JobId']) == response['JobId']:
                        print('Matching Job Found:' + textMessage['JobId'])
                        jobFound = True
                        self.GetResults(textMessage['JobId'])
                        self.sqs.delete_message(QueueUrl=self.sqsQueueUrl,
                                                ReceiptHandle=message['ReceiptHandle'])
                    else:
                        print("Job didn't match:" +
                              str(textMessage['JobId']) + ' : ' + str(response['JobId']))
                        # Delete the unknown message. Consider sending to dead letter queue
                        self.sqs.delete_message(QueueUrl=self.sqsQueueUrl,
                                                ReceiptHandle=message['ReceiptHandle'])

        print('Done!')

    def CreateTopicandQueue(self):

        millis = str(int(round(time.time() * 1000)))

        # Create SNS topic (the name must start with AmazonTextract)
        snsTopicName = "AmazonTextractTopic" + millis

        topicResponse = self.sns.create_topic(Name=snsTopicName)
        self.snsTopicArn = topicResponse['TopicArn']

        # Create SQS queue
        sqsQueueName = "AmazonTextractQueue" + millis
        self.sqs.create_queue(QueueName=sqsQueueName)
        self.sqsQueueUrl = self.sqs.get_queue_url(QueueName=sqsQueueName)['QueueUrl']

        attribs = self.sqs.get_queue_attributes(QueueUrl=self.sqsQueueUrl,
                                                AttributeNames=['QueueArn'])['Attributes']

        sqsQueueArn = attribs['QueueArn']

        # Subscribe SQS queue to SNS topic
        self.sns.subscribe(
            TopicArn=self.snsTopicArn,
            Protocol='sqs',
            Endpoint=sqsQueueArn)

        # Authorize SNS to write to the SQS queue
        policy = """{{
  "Version":"2012-10-17",
  "Statement":[
    {{
      "Sid":"MyPolicy",
      "Effect":"Allow",
      "Principal" : {{"AWS" : "*"}},
      "Action":"SQS:SendMessage",
      "Resource": "{}",
      "Condition":{{
        "ArnEquals":{{
          "aws:SourceArn": "{}"
        }}
      }}
    }}
  ]
}}""".format(sqsQueueArn, self.snsTopicArn)

        response = self.sqs.set_queue_attributes(
            QueueUrl=self.sqsQueueUrl,
            Attributes={
                'Policy': policy
            })

    def DeleteTopicandQueue(self):
        self.sqs.delete_queue(QueueUrl=self.sqsQueueUrl)
        self.sns.delete_topic(TopicArn=self.snsTopicArn)

    # Display information about a block
    def DisplayBlockInfo(self, block):

        print("Block Id: " + block['Id'])
        print("Type: " + block['BlockType'])
        if 'EntityTypes' in block:
            print('EntityTypes: {}'.format(block['EntityTypes']))

        if 'Text' in block:
            print("Text: " + block['Text'])

        if block['BlockType'] != 'PAGE':
            print("Confidence: " + "{:.2f}".format(block['Confidence']) + "%")

        print('Page: {}'.format(block['Page']))

        if block['BlockType'] == 'CELL':
            print('Cell Information')
            print('\tColumn: {} '.format(block['ColumnIndex']))
            print('\tRow: {}'.format(block['RowIndex']))
            print('\tColumn span: {} '.format(block['ColumnSpan']))
            print('\tRow span: {}'.format(block['RowSpan']))

            if 'Relationships' in block:
                print('\tRelationships: {}'.format(block['Relationships']))

        print('Geometry')
        print('\tBounding Box: {}'.format(block['Geometry']['BoundingBox']))
        print('\tPolygon: {}'.format(block['Geometry']['Polygon']))

        if block['BlockType'] == 'SELECTION_ELEMENT':
            print('    Selection element detected: ', end='')
            if block['SelectionStatus'] == 'SELECTED':
                print('Selected')
            else:
                print('Not selected')

    def GetResults(self, jobId):
        maxResults = 1000
        paginationToken = None
        finished = False

        # Results are paginated: keep requesting until there is no NextToken
        while not finished:

            response = None

            if self.processType == ProcessType.ANALYSIS:
                if paginationToken is None:
                    response = self.textract.get_document_analysis(JobId=jobId,
                                                                   MaxResults=maxResults)
                else:
                    response = self.textract.get_document_analysis(JobId=jobId,
                                                                   MaxResults=maxResults,
                                                                   NextToken=paginationToken)

            if self.processType == ProcessType.DETECTION:
                if paginationToken is None:
                    response = self.textract.get_document_text_detection(JobId=jobId,
                                                                         MaxResults=maxResults)
                else:
                    response = self.textract.get_document_text_detection(JobId=jobId,
                                                                         MaxResults=maxResults,
                                                                         NextToken=paginationToken)

            blocks = response['Blocks']
            print('Detected Document Text')
            print('Pages: {}'.format(response['DocumentMetadata']['Pages']))

            # Display block information
            for block in blocks:
                self.DisplayBlockInfo(block)
                print()
                print()

            if 'NextToken' in response:
                paginationToken = response['NextToken']
            else:
                finished = True

    def GetResultsDocumentAnalysis(self, jobId):
        maxResults = 1000
        paginationToken = None
        finished = False

        while not finished:

            response = None
            if paginationToken is None:
                response = self.textract.get_document_analysis(JobId=jobId,
                                                               MaxResults=maxResults)
            else:
                response = self.textract.get_document_analysis(JobId=jobId,
                                                               MaxResults=maxResults,
                                                               NextToken=paginationToken)

            # Get the text blocks
            blocks = response['Blocks']
            print('Analyzed Document Text')
            print('Pages: {}'.format(response['DocumentMetadata']['Pages']))
            # Display block information
            for block in blocks:
                self.DisplayBlockInfo(block)
                print()
                print()

            if 'NextToken' in response:
                paginationToken = response['NextToken']
            else:
                finished = True


def main():
    # Fill these in before running
    roleArn = ''
    bucket = ''
    document = ''
    region_name = ''

    analyzer = DocumentProcessor(roleArn, bucket, document, region_name)
    analyzer.CreateTopicandQueue()
    analyzer.ProcessDocument(ProcessType.DETECTION)
    analyzer.DeleteTopicandQueue()


if __name__ == "__main__":
    main()

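Textract synchronous processing

DetectDocumentText finds the lines and words in a document along with their positions. Amazon Textract accepts the document either as an S3 object or as base64-encoded bytes sent in the request. According to the official documentation on detecting text, the response returns the following for the document:

  • The lines and words of text
  • The relationships between lines and the words inside them
  • Where on the image each piece of detected text appears

Here we use synchronous detection (each call finishes before the next one starts). For asynchronous detection, see StartDocumentTextDetection and fetch the result with GetDocumentTextDetection. For details and examples, see Processing documents with asynchronous operations.

Creating the client

First create a client, then call detect_document_text to run the analysis.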
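The relationship between a line and its words is carried by CHILD relationships: a LINE block lists the Ids of its WORD blocks. As a rough sketch of how that can be resolved, the function below looks up the words of one line. The function name words_for_line and the tiny hand-written blocks are assumptions for illustration, not real Textract output.

```python
def words_for_line(line_block, blocks_by_id):
    """Return the text of every WORD block that is a CHILD of the given LINE block."""
    words = []
    for relationship in line_block.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for child_id in relationship["Ids"]:
            child = blocks_by_id.get(child_id)
            if child is not None and child["BlockType"] == "WORD":
                words.append(child["Text"])
    return words


# Hand-written blocks with the same fields a response uses.
sample_blocks = [
    {"Id": "line-1", "BlockType": "LINE", "Text": "Hello world",
     "Relationships": [{"Type": "CHILD", "Ids": ["word-1", "word-2"]}]},
    {"Id": "word-1", "BlockType": "WORD", "Text": "Hello"},
    {"Id": "word-2", "BlockType": "WORD", "Text": "world"},
]
blocks_by_id = {block["Id"]: block for block in sample_blocks}
print(words_for_line(sample_blocks[0], blocks_by_id))  # ['Hello', 'world']
```

Indexing the blocks by Id once keeps each lookup cheap even when a page has many words.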


# AWS
import boto3

# Image handling (used when drawing the results)
from PIL import ImageDraw
from pdf2image import convert_from_bytes

# Read the PDF
pdf_file = 'test.pdf'  # the file to process
with open(pdf_file, 'rb') as file:
    pdf_binary = file.read()

# AWS client
client = boto3.client('textract', region_name='us-east-1')

# Textract: send the document bytes directly in the request
response = client.detect_document_text(Document={'Bytes': pdf_binary})

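Processing the response

The code below is the official sample. It shows how to process the returned data and draw the position of each piece of text on the image. The bounding box is expressed as ratios of the page size, so each value has to be multiplied by the image width or height to get the actual pixel position.

Detected text is returned in the Text field of each Block object. The BlockType field tells you whether the text is a line (LINE) or a word (WORD). A word is one or more ISO basic Latin script characters that are not separated by spaces. A line is a string of tab-delimited, contiguous words.

Finally, the rectangle to draw is given by the two corner points that define the bounding box, as a sequence of either [(x0, y0), (x1, y1)] or [x0, y0, x1, y1].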

def process_text_detection():
    # Render the PDF to images and draw on the first page
    images = convert_from_bytes(pdf_binary)
    image = images[0]

    # Get the text blocks
    blocks = response['Blocks']
    width, height = image.size
    draw = ImageDraw.Draw(image)
    print('Detected Document Text')

    # Create image showing bounding box/polygon around the detected lines/text
    for block in blocks:
        print('Type: ' + block['BlockType'])
        if block['BlockType'] != 'PAGE':
            print('Detected: ' + block['Text'])
            print('Confidence: ' + "{:.2f}".format(block['Confidence']) + "%")

        print('Id: {}'.format(block['Id']))
        if 'Relationships' in block:
            print('Relationships: {}'.format(block['Relationships']))
        print('Bounding Box: {}'.format(block['Geometry']['BoundingBox']))
        print('Polygon: {}'.format(block['Geometry']['Polygon']))
        print()

        # Draw WORD - green marks the start of the word, red marks the end
        if block['BlockType'] == "WORD":
            draw.line([(width * block['Geometry']['Polygon'][0]['X'],
                        height * block['Geometry']['Polygon'][0]['Y']),
                       (width * block['Geometry']['Polygon'][3]['X'],
                        height * block['Geometry']['Polygon'][3]['Y'])],
                      fill='green',
                      width=2)

            draw.line([(width * block['Geometry']['Polygon'][1]['X'],
                        height * block['Geometry']['Polygon'][1]['Y']),
                       (width * block['Geometry']['Polygon'][2]['X'],
                        height * block['Geometry']['Polygon'][2]['Y'])],
                      fill='red',
                      width=2)

        # Draw box around entire LINE
        if block['BlockType'] == "LINE":
            points = []

            # Polygon coordinates are ratios, so scale them to pixels
            for polygon in block['Geometry']['Polygon']:
                points.append((width * polygon['X'], height * polygon['Y']))

            draw.polygon(points, outline='black')

            # Also draw the axis-aligned bounding box in yellow
            box = block['Geometry']['BoundingBox']
            left = width * box['Left']
            top = height * box['Top']
            draw.rectangle([left, top, left + (width * box['Width']), top + (height * box['Height'])],
                           outline='yellow')

    # Display the image
    image.show()

    return len(blocks)
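The ratio-to-pixel conversion used in the drawing code can be isolated into a small helper. This is a minimal sketch; the function name bounding_box_to_pixels and the sample box are made up for the example, and the output is in the [x0, y0, x1, y1] form described earlier.

```python
def bounding_box_to_pixels(box, image_width, image_height):
    """Convert a ratio-based BoundingBox into [x0, y0, x1, y1] pixel coordinates."""
    left = box["Left"] * image_width
    top = box["Top"] * image_height
    right = left + box["Width"] * image_width
    bottom = top + box["Height"] * image_height
    return [round(left), round(top), round(right), round(bottom)]


# A box starting a quarter of the way across and halfway down the page.
sample_box = {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25}
print(bounding_box_to_pixels(sample_box, 1000, 800))  # [250, 400, 750, 600]
```

Horizontal values scale with the image width and vertical values with the image height.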

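Result

For the detailed specification, see Text Detection and Document Analysis Response Objects.

Notice the field called Confidence. In the image above and the JSON below, the detected line reads "#Get x the * document IN from IN S3". The x, the * and both IN tokens should not be there, and their confidence is low, about 25 each. Low-confidence content like this should be filtered out. Depending on the scenario, low-confidence detections may need visual confirmation by a person.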

{
  "DocumentMetadata": { "Pages": 1 },
  "Blocks": [
    {
      "BlockType": "PAGE",
      "Geometry": {
        "BoundingBox": {
          "Width": 1.0,
          "Height": 0.9976544976234436,
          "Left": 0.0,
          "Top": 0.0
        },
        "Polygon": [
          { "X": 0.0, "Y": 0.0 },
          { "X": 1.0, "Y": 2.8319884677330265e-6 },
          { "X": 1.0, "Y": 0.9966249465942383 },
          { "X": 0.0, "Y": 0.9976544976234436 }
        ]
      },
      "Id": "964301ea-606d-4842-b329-e935cdd1ccac",
      "Relationships": [
        {
          "Type": "CHILD",
          "Ids": [
            "f4e39047-43f0-468b-a10b-d8718af58a9a",
            "d4d91acf-da43-4c51-828d-a99528296d76",
            "e02398d3-94f9-4dc6-9d5a-9259fa6fbe1d",
            "d331852e-b1e7-474a-9906-200fea91a887",
            "6299c37e-3586-49bf-8804-a7660e998acc",
            "cc4dda56-bff8-4c37-962b-6f9112680dc7"
          ]
        }
      ]
    },
    {
      "BlockType": "LINE",
      "Confidence": 70.04537200927734,
      "Text": "#Get x the * document IN from IN S3",
      "Geometry": {
        "BoundingBox": {
          "Width": 0.30295300483703613,
          "Height": 0.014886284247040749,
          "Left": 0.15722371637821198,
          "Top": 0.11024140566587448
        },
        "Polygon": [
          { "X": 0.15722371637821198, "Y": 0.11024140566587448 },
          { "X": 0.4601767361164093, "Y": 0.11063025891780853 },
          { "X": 0.46017175912857056, "Y": 0.12512768805027008 },
          { "X": 0.15722690522670746, "Y": 0.12475030869245529 }
        ]
      },
      "Id": "f4e39047-43f0-468b-a10b-d8718af58a9a",
      "Relationships": [
        {
          "Type": "CHILD",
          "Ids": [
            "005d8051-a3e8-414f-b509-3e3f70dce56c",
            "2e14dc51-9d9d-4740-82fb-2f6710febacf", # note this one
            "d6ef827c-d9fa-488c-b295-8d8a0b20ee86",
            "5224fd43-c247-4f96-aa67-ae4e9d63b3f1",
            "c78aca25-1f8d-4d29-9b27-5f06fb32561e",
            "c0bdce47-96af-4b3e-aacd-672deec8c2d2",
            "e1f3f661-3f2a-4ad6-9875-835d05e18132",
            "88b396a3-1f7c-455b-b73f-4ed5e2ef300a",
            "6d9b7d1d-0a22-4a53-a141-12ce50157b64"
          ]
        }
      ]
    },
    ...
    },
    {
      "BlockType": "WORD",
      "Confidence": 25.193822860717773,
      "Text": "x", # the image has no x here at all; the confidence is correspondingly low, so this should be filtered out
      "TextType": "PRINTED",
      "Geometry": {
        "BoundingBox": {
          "Width": 0.004908399190753698,
          "Height": 0.0033209826797246933,
          "Left": 0.20951908826828003,
          "Top": 0.1167941763997078
        },
        "Polygon": [
          { "X": 0.20951908826828003, "Y": 0.1167941763997078 },
          { "X": 0.21442709863185883, "Y": 0.11680039763450623 },
          { "X": 0.21442748606204987, "Y": 0.12011516094207764 },
          { "X": 0.20951949059963226, "Y": 0.12010898441076279 }
        ]
      },
      "Id": "2e14dc51-9d9d-4740-82fb-2f6710febacf"
    },
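One way to act on the Confidence field is to split the blocks into those above and below a threshold, so the low-confidence ones can be dropped or sent for human review. The sketch below assumes a threshold of 50, which is an arbitrary value chosen for illustration and would need tuning per use case. The function name split_by_confidence is hypothetical, and the second sample word and its confidence are invented.

```python
def split_by_confidence(blocks, min_confidence=50.0):
    """Split blocks into (kept, rejected) by their Confidence value."""
    kept, rejected = [], []
    for block in blocks:
        # PAGE blocks have no Confidence field; treat them as always kept.
        if block.get("Confidence", 100.0) >= min_confidence:
            kept.append(block)
        else:
            rejected.append(block)
    return kept, rejected


sample_words = [
    # The spurious "x" from the response above.
    {"BlockType": "WORD", "Text": "x", "Confidence": 25.193822860717773},
    # An invented high-confidence word for contrast.
    {"BlockType": "WORD", "Text": "document", "Confidence": 99.0},
    {"BlockType": "PAGE"},
]
kept, rejected = split_by_confidence(sample_words)
print([block.get("Text") for block in rejected])  # ['x']
```

Returning the rejected blocks rather than discarding them leaves room for the human check mentioned above.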

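Additional notes

Where the text is located

To determine where an item is on a document page, use the bounding information (Geometry) that Amazon Textract operations return in a Block object. The Geometry object contains two kinds of location and geometry information for a detected item:

  • An axis-aligned BoundingBox object that contains the top-left coordinate plus the width and height of the item.
  • A polygon that describes the outline of the item, specified as an array of Point objects, each holding the X (horizontal axis) and Y (vertical axis) document page coordinates of one point.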
{
  "Geometry": {
    "BoundingBox": {
      "Width": 0.053907789289951324,
      "Top": 0.08913730084896088,
      "Left": 0.11085548996925354,
      "Height": 0.013171200640499592
    },
    "Polygon": [
      { # starting point
        "Y": 0.08985357731580734,
        "X": 0.11085548996925354
      },
      {
        "Y": 0.08913730084896088,
        "X": 0.16447919607162476
      },
      {
        "Y": 0.10159222036600113,
        "X": 0.16476328670978546
      },
      { # ending point
        "Y": 0.10230850428342819,
        "X": 0.11113958805799484
      }
    ]
  },
  "Text": "Name:",
  "TextType": "PRINTED",
  "BlockType": "WORD",
  "Confidence": 99.56285858154297,
  "Id": "c734fca6-c4c4-415c-b6c1-30f7510b72ee"
},
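The two representations are related: in the example above, the BoundingBox matches the smallest axis-aligned rectangle that encloses the polygon's points. The sketch below recomputes the box from the polygon under that assumption, which holds for these sample values but is not stated as a guarantee in the text. The function name polygon_to_bounding_box is made up for the example.

```python
def polygon_to_bounding_box(polygon):
    """Return the smallest axis-aligned box (ratio coordinates) enclosing the polygon."""
    xs = [point["X"] for point in polygon]
    ys = [point["Y"] for point in polygon]
    left, top = min(xs), min(ys)
    return {
        "Left": left,
        "Top": top,
        "Width": max(xs) - left,
        "Height": max(ys) - top,
    }


# Polygon points copied from the "Name:" block above.
name_polygon = [
    {"X": 0.11085548996925354, "Y": 0.08985357731580734},
    {"X": 0.16447919607162476, "Y": 0.08913730084896088},
    {"X": 0.16476328670978546, "Y": 0.10159222036600113},
    {"X": 0.11113958805799484, "Y": 0.10230850428342819},
]
print(polygon_to_bounding_box(name_polygon))
```

The computed Left and Top equal the reported values, and Width and Height agree with them to within rounding. For slightly rotated text such as this, the polygon follows the text outline more closely than the box does.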