How to Handle Multi-Column Text Sorting with Amazon Textract
Preface
Amazon Textract is an AWS service for extracting text from PDFs and images. Ideally, the original document has a single column, as in a book. Things become more complex with multiple columns, as in newspaper articles. This article therefore shares how to use Amazon Textract to sort text from multi-column documents. It is inspired by AWS Textract: how to detect and sort text from a multi-column document, with some improvements ma ...
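As a rough illustration of the sorting problem described above, the sketch below orders Textract `LINE` blocks column by column. It is not the method from this article or from the article it cites. It assumes the blocks have already been fetched and follow Textract's response shape (`BlockType`, `Text`, `Geometry.BoundingBox` with `Left` and `Top` as page ratios). The function name `sort_lines_by_column` and the `column_gap` threshold are hypothetical, and the grouping rule is a simplification that ignores full-width headings.

```python
from typing import Dict, List


def sort_lines_by_column(blocks: List[Dict], column_gap: float = 0.1) -> List[str]:
    """Return LINE text in reading order: each column top to bottom, left to right.

    `column_gap` (hypothetical parameter) is the minimum jump in the Left
    coordinate, as a fraction of page width, that starts a new column.
    """
    def left(block: Dict) -> float:
        return block["Geometry"]["BoundingBox"]["Left"]

    def top(block: Dict) -> float:
        return block["Geometry"]["BoundingBox"]["Top"]

    # Keep only LINE blocks; PAGE and WORD blocks are ignored here.
    lines = [b for b in blocks if b.get("BlockType") == "LINE"]

    # Sort by left edge so lines from the same column become adjacent.
    lines.sort(key=left)

    # Start a new column whenever the left edge jumps by column_gap or more.
    columns: List[List[Dict]] = []
    for line in lines:
        if columns and left(line) - left(columns[-1][-1]) < column_gap:
            columns[-1].append(line)
        else:
            columns.append([line])

    # Within each column, read top to bottom.
    ordered: List[str] = []
    for column in columns:
        column.sort(key=top)
        ordered.extend(b["Text"] for b in column)
    return ordered
```

Sorting by `Top` alone would interleave the two columns line by line, which is why the lines are grouped by their left edge first.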
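Using HuggingFace Models - Building a LangChain Application
I wanted an LLM to do some work for me through LangChain, but there were too many pitfalls along the way, so this article mainly fills in those gaps.
What Are GGUF, GGML, and GGJT
They appeared roughly in this order: GGML -> GGJT -> GGUF
GGUF (GPT-Generated Unified Format)
GGUF is a new format introduced by the llama.cpp team on August 21, 2023. It mainly replaces GGML, which llama.cpp no longer supports.
It has further advantages, such as better tokenisation, support for special tokens, and easier extensibility.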
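One practical way to tell a GGUF file from an older GGML-era file is to look at its first bytes. The sketch below is a minimal check, assuming the standard little-endian layout in which a GGUF file starts with the four ASCII bytes `GGUF` followed by a 32-bit version number. The function name `read_gguf_version` is hypothetical and not part of llama.cpp.

```python
import struct
from typing import BinaryIO, Optional

# The first four bytes of a GGUF file.
GGUF_MAGIC = b"GGUF"


def read_gguf_version(stream: BinaryIO) -> Optional[int]:
    """Return the GGUF version number, or None if the stream is not GGUF.

    Assumes a little-endian file; only the first 8 bytes are read.
    """
    header = stream.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The magic is followed by an unsigned 32-bit version field.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

To check a model on disk, open it in binary mode and pass the file object, for example `with open(path, "rb") as f: read_gguf_version(f)`, where `path` is whatever file you downloaded.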
GGML (GPT-Generated Model Language)
It can be traced back to October 2022 and is a tensor library designed specifically for machine learning.
Its purpose is
GGUF Model
ref: https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/blob/main/README.md
ref: Converting HuggingFace Models to GGUF and Using llama. ...
Windows - A Step-by-Step Guide to Enable CUDA GPU on TensorFlow
Introduction
This section explains how to use a GPU with TensorFlow. Many installation tutorials are available online, but you may have followed the official tutorial step by step without success and ended up with the following setup:
CUDA 11.2 or higher
CUDA Toolkit 11.2 or higher
TensorFlow installed directly with the latest version (no version specified)
If running the following code prints 0, this article is what you need.
import tensorflow as ...
MAC OS - PyTorch on Mac OS with GPU support
References
Installing GPU-supported PyTorch and TensorFlow on Mac M1/M2
Accelerated PyTorch training on Mac
Enabling GPU on Mac OS for PyTorch
I reinstalled GPU-supported PyTorch on top of Anaconda, so first check whether Conda is installed by running conda --version. If it is installed, the command prints the installed version. If not, you can download it from the official Anaconda website.
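The same check can be scripted. The sketch below is a minimal example using only the Python standard library: it looks for a `conda` executable on `PATH` and, if one is found, runs `conda --version`. The function name `conda_version` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def conda_version() -> Optional[str]:
    """Return the output of `conda --version`, or None if Conda is unavailable."""
    # Locate the conda executable on PATH; None means it is not installed
    # or the current shell has not been initialised for Conda.
    executable = shutil.which("conda")
    if executable is None:
        return None
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print their version to stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    return output or None


if __name__ == "__main__":
    version = conda_version()
    print(version if version else "Conda not found on PATH")
```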
(Optional) If you want to create a separate environment specifically for Python ...
How to Intervene in Connection Pool?
Reference Links
Hikari Connection Pooling with Spring Boot
AWS RDS Proxy
Hikari Source Code
AWS Launches Proxy Service to Boost Scalability of Relational Database Applications
Hikari Configuration Settings
Inject Druid DataSource for Monitoring
How to Specify Your Own DataSource Extending Hikari
How to Build a Practical RDS Proxy? (Part 1): Briefly mentions implementing an RDS Proxy using Golang
kingshard: An open-source MySQL Proxy written in Go, no longer actively maintained
MySQL Official C ...
NIST SP 800-209 Security Guidelines for Storage (1) Threats and Risks
References
NIST SP800-209 Security Guidelines for Storage Infrastructure
OWASP API Top 10, 2023
OWASP Top 10, 2021
OWASP Top 10, 2021: Summary by others with concise content.
OWASP Top 10, 2021: Presentation with clearer details on specific measures and threats.
Introduction
Based on the previous content and research topics, it is essential to first understand that the main purpose of this article is to organize:
[x] What are the relevant threats and risks of data storage?
[ ] How to prevent ...
NIST SP 800-209 Security Guidelines for Storage (2)
Introduction
Due to concerns about the length of the previous article NIST SP 800-209 Security Guidelines for Storage (1) Threats and Risks, a new post has been created to organize the content.
The previous article primarily addressed threats, risks, and attack surfaces related to inventorying storage infrastructure. This article focuses on summarizing security recommendations for storage deployments. The contribution of this thesis lies in emphasizing measures that align with zero trust require ...
Zero Trust Architecture (ZTA) Principles Applied to AP and DB
Reference Links
In 2020, NIST released NIST SP 800-207
In 2021, the U.S. Defense Information Systems Agency published the Department of Defense Zero Trust Reference Architecture
In September 2021, the U.S. Office of Management and Budget released a formal document after seeking input on the Federal Zero Trust Strategy
Introduction
As a graduate student researching security between databases (DB) and applications (AP), understanding the requirements and architecture of Zero Trust (ZTA) is cruci ...
JDBC Basics and Connection Pooling Explained
References
What Is JDBC?
(14) Dive into JDBC, Connection Pool, and Introduce H2 DB
Design Pattern - Object Pool
Understanding the Implementation and Principles of Database Connection Pool
Why Is Hikari So Fast
Difference Between spring-boot-starter-jdbc and spring-boot-starter-data-jdbc
Spring Boot - Data Properties
Introduction
JDBC, short for Java Database Connectivity, is primarily an API standard used for connecting Java programming language to databases. You can also think of it as a libr ...
Flower102 Dataset - Using Transfer Learning to train + Using Batch Normalization in CNN
Preface
I recently took an AI course. This is the fourth assignment, and it covers the following main topics:
Selecting a dataset and training a model on it.
Transfer learning - fine-tuning.
Batch normalization in CNN.
The main references are the following websites:
Flower102 dataset
Transfer Learning
PyTorch dataset
Transfer Learning Model
Shannon’s Transfer Learning Blog
Resnet18
Assignment Requirements
Tasks
Choose a dataset: Look at torchvision PyTorch’s dataset and decide wh ...