Tired of having to wrestle with OCR every time you want to analyze a document? And the output is often a mess anyway.
Alibaba just released a multimodal LLM that can 'read' PDFs directly, the way human eyes do.
No more text extraction, no more pre-processing. Straight to the point!
Here's the gist:
Zero OCR Workflow: Usually we need a text-extraction step before anything goes into the AI. This model bypasses all of that.
It takes the document as visual input, so it can make sense of even complicated tables and charts. Literally like reading it with your own eyes.
No Pre-processing Drama: No need to wear yourself out cleaning data or fixing text that got cut off. Just point it at the PDF and it picks up the context holistically.
Context Aware: Because it's multimodal, it doesn't just read the text; it also understands image placement and layout. So fewer hallucinations caused by misreading the paragraph order.
Why It Matters?
If your work is a pile of research papers, financial reports, or legal documents that run for pages and pages, this is a real life saver. Efficiency level: God mode!
What do you think? Will this make the old OCR tools obsolete? Hahaha
Here's the repo: https://t.co/jW5SSrOt2T
@ScholarshipfPhd I posted on this a few days ago. He has had various Google Scholar accounts, which he deleted, but they keep popping up again. The one I posted about has over five million citations and an h-index of >1,700.
https://t.co/4lGLYNnRMo
as long as they can't disobey their pre-programmed instructions or what their algo is supposed to do, we're not there yet.
humans are agi: capable of disobedience. and that is not algorithmic.
I made a magic quadrant about how to choose your tech stack
Seriously, DYOR: no one will do it for you. Try out stuff. See what works. Ignore reports from people who don't do the work and are pay-to-play anyway
Top 20 SQL query optimization techniques
Here is the list of the top 20 SQL query optimization techniques I found noteworthy:
1. Create an index on huge tables (>1,000,000 rows)
2. Use EXISTS instead of COUNT() to check whether an element is in the table
3. SELECT fields instead of using SELECT *
4. Avoid Subqueries in WHERE Clause
5. Avoid SELECT DISTINCT where possible
6. Use WHERE Clause instead of HAVING
7. Create joins with INNER JOIN (not WHERE)
8. Use LIMIT to sample query results
9. Use UNION ALL instead of UNION wherever possible
10. Use UNION instead of WHERE ... OR ... queries
11. Run your query during off-peak hours
12. Avoid using OR in join queries
13. Choose GROUP BY over window functions
14. Use derived and temporary tables
15. Drop the index before loading bulk data
16. Use materialized views instead of views
17. Avoid != or <> (not equal) operator
18. Minimize the number of subqueries
19. Use INNER JOIN as little as possible when you can get the same output using LEFT/RIGHT JOIN.
20. When you retrieve the same dataset frequently, try to use temporary sources.
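A few of these tips are easy to try out yourself. Here is a minimal sketch using Python's built-in sqlite3 module; the `orders` and `archived_orders` tables and their columns are made up for illustration, and SQLite is just a convenient stand-in since the list does not name a database engine. It shows EXISTS vs COUNT (tip 2), WHERE vs HAVING (tip 6), and UNION ALL vs UNION (tip 9):

```python
import sqlite3

# Hypothetical schema and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, amount REAL);
    CREATE TABLE archived_orders (
        id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, amount REAL);
""")
conn.executemany(
    "INSERT INTO orders (customer_id, status, amount) VALUES (?, ?, ?)",
    [(i % 100, "paid" if i % 3 else "open", float(i)) for i in range(1, 1001)],
)
conn.executemany(
    "INSERT INTO archived_orders (customer_id, status, amount) VALUES (?, ?, ?)",
    [(i % 100, "paid", float(i)) for i in range(1, 501)],
)

# Tip 2: EXISTS can stop at the first matching row,
# while COUNT(*) has to count every match.
has_open_exists = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM orders WHERE status = 'open')"
).fetchone()[0] == 1
has_open_count = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status = 'open'"
).fetchone()[0] > 0

# Tip 6: WHERE filters rows before grouping, HAVING filters groups afterwards.
# For a filter on the grouping column, both return the same rows.
where_totals = conn.execute("""
    SELECT customer_id, SUM(amount) FROM orders
    WHERE customer_id < 5
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()
having_totals = conn.execute("""
    SELECT customer_id, SUM(amount) FROM orders
    GROUP BY customer_id
    HAVING customer_id < 5
    ORDER BY customer_id
""").fetchall()

# Tip 9: UNION ALL skips the de-duplication step that UNION performs.
union_all_rows = conn.execute("""
    SELECT id FROM orders WHERE customer_id = 7
    UNION ALL
    SELECT id FROM archived_orders WHERE customer_id = 7
""").fetchall()
union_rows = conn.execute("""
    SELECT id FROM orders WHERE customer_id = 7
    UNION
    SELECT id FROM archived_orders WHERE customer_id = 7
""").fetchall()

print(has_open_exists, has_open_count)
print(where_totals == having_totals)
print(len(union_all_rows), len(union_rows))
```

Note that in this made-up data the two tables share some ids, so UNION ALL returns more rows than UNION. It is only a drop-in replacement when duplicates are impossible or you don't mind them.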
Do you know what a Query Optimizer is? Its primary function is to determine the most efficient way to execute a given SQL query by finding the best execution plan. The query optimizer works by taking the SQL query as input and analyzing it to determine how best to execute it. The first step is to parse the SQL query and create a syntax tree. The optimizer then analyzes the syntax tree to determine how to run the query.
Next, the optimizer generates alternative execution plans, which are different ways of executing the same query. Each execution plan specifies the order in which the tables should be accessed, the join methods, and any filtering or sorting operations. The optimizer then assigns a cost to each execution plan based on the number of disk reads and the CPU time required to execute the query.
Finally, the optimizer chooses the execution plan with the lowest cost as the optimal execution plan for the query. This plan is then used to execute the query.
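You can watch an optimizer change its mind. Below is a minimal sketch, again assuming SQLite through Python's sqlite3 module and a made-up `orders` table; `query_plan` is a hypothetical helper, not a library function. It asks SQLite for the plan it picked for the same query before and after an index exists, using SQLite's `EXPLAIN QUERY PLAN`:

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite chose for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the description of the step.
    return [row[-1] for row in rows]


# Hypothetical table and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

sql = "SELECT id, amount FROM orders WHERE customer_id = ?"

# Without an index, the only available plan is a full table scan.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column, an index lookup becomes an option.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should describe a scan of `orders` and the second a search using `idx_orders_customer`. Other engines expose the same idea through their own `EXPLAIN` variants.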
Check the image for the order in which SQL queries run.
#technology #softwareengineering #programming #techworldwithmilan #sql