1. Data Deduplication
In daily work, data queried and exported with Hive or Impala sometimes contains duplicate rows. Re-running the query is often impractical, because the query takes a long time and the exported file is large, so it is easier to remove the duplicates from the file with Linux commands. The following is an example.
Suppose aaa.txt contains three duplicate rows and only one of them should be kept:
sort aaa.txt | uniq > bbb.txt
This sorts aaa.txt so that identical lines become adjacent, lets uniq drop the repeats, and writes the result to bbb.txt. Afterwards, bbb.txt retains only one copy of the duplicated row.
2. Data intersection, union, and difference
1) Intersection (equivalent to user_2019 inner join user_2020 on user_2019.user_no=user_2020.user_no)
2) Union (equivalent to user_2019.user_no union user_2020.user_no)
3) Difference
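The commands for the three set operations are not shown in the text, so the following is only a minimal sketch of one way to do them with standard tools. It assumes two hypothetical files, user_2019.txt and user_2020.txt, each holding one user_no per line; the file names, the sample values, and the output file names are illustrative and not taken from the article.

```shell
#!/bin/sh
# Use one fixed collation so that sort and comm agree on the ordering.
export LC_ALL=C

# Hypothetical sample data (assumption): one user_no per line.
printf '%s\n' 1001 1002 1003 > user_2019.txt
printf '%s\n' 1002 1003 1004 > user_2020.txt

# comm needs sorted input; sort -u also removes duplicates inside each file.
sort -u user_2019.txt > user_2019.sorted
sort -u user_2020.txt > user_2020.sorted

# Intersection: keep only the lines found in both files (hide columns 1 and 2).
comm -12 user_2019.sorted user_2020.sorted > intersection.txt

# Union: merge both files and keep one copy of each line.
sort -u user_2019.sorted user_2020.sorted > union.txt

# Difference: lines in user_2019 that are not in user_2020 (hide columns 2 and 3).
comm -23 user_2019.sorted user_2020.sorted > difference.txt
```

When neither input file contains internal duplicates, the same intersection can also be obtained in the style of the deduplication example with sort user_2019.txt user_2020.txt | uniq -d, since uniq -d prints one copy of each line that occurs more than once.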
The above is the full content of this article. I hope it will be helpful for everyone’s study. I also hope that everyone will support 123WORDPRESS.COM.