Tuesday, 10 October 2017

5 ways to count the lines of a file in Linux

There are several ways to count the number of lines in a file in Linux. In day-to-day work we often need the line count of a CSV file, a text file and so on, and the command most people reach for is "wc -l". In this article I will show you 5 different ways to find the number of lines, including wc -l.



Let us consider two sample files, sample1.txt and sample2.txt, which have 10 and 3 lines respectively, as shown below. I will use them in the examples that follow.

Contents of sample1.txt
[~]$ cat sample1.txt

One
Two
Three
four
five
six This is longest line
seven
eight
nine
ten

Contents of sample2.txt

[~]$ cat sample2.txt
One
Two Longest Line
Three

The wc (word count) command is the usual tool in Unix/Linux for counting the lines, words, bytes and characters in a file. Its syntax is as follows.
wc  [OPTION]... [FILE]...

The options are:

  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
      --files0-from=F    read input from the files specified by
                           NUL-terminated names in file F;
                           If F is - then read names from standard input
  -L, --max-line-length  print the length of the longest line
  -w, --words            print the word counts

[~]$ wc sample1.txt sample2.txt

10 14 70 sample1.txt
 3  5 27 sample2.txt
13 19 97 total

In the above example the first column shows the number of lines, the second the number of words and the third the number of bytes (use -m to count characters instead). The last row shows the totals for all files.

Let us look at the different ways to find the number of lines in a file. We will use sample1.txt in the demos.

1) Counting lines with the wc command

[~]$ cat sample1.txt | wc -l
10
or
[~]$ wc -l sample1.txt
10 sample1.txt
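
When only the number is needed, for example to store it in a variable, one possible sketch is to feed the file on standard input so that wc has no file name to print. The file created here and the variable name n are made up for this illustration.

```shell
# Create a small 10-line demo file (hypothetical, stands in for sample1.txt).
f=$(mktemp)
printf '%s\n' One Two Three four five six seven eight nine ten > "$f"

# Reading from standard input means wc prints no file name.
n=$(wc -l < "$f")

# Some wc implementations pad the number with spaces;
# arithmetic expansion normalizes it to a plain integer.
n=$((n))
echo "$n"
rm -f "$f"
```

This avoids the extra cat process while still giving a bare number.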

2) Counting lines with sed command
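In sed, the "=" command prints the current line number and the "$" address selects the last line, so "$=" prints the number of the last line, which is the total number of lines. Single quotes keep the shell from treating "$" specially.
[~]$ sed -n '$=' sample1.txt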
10

3) grep command with -c option:

To count all non-empty lines:
[~]$ grep -c "." sample2.txt
3
To count all lines, including empty ones:
[~]$ grep -c ".*" sample2.txt
4
or
[~]$ grep -c "^" sample2.txt
4
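
Note that grep reports 4 lines here while wc reported 3 for the same file. That is consistent with a file that contains one blank line and has no newline after its last line, because wc -l counts newline characters while grep counts lines. The exact layout of sample2.txt is an assumption, since the listing does not show it; the sketch below builds such a file (hypothetical content) to illustrate the difference.

```shell
# Hypothetical demo file: three text lines, one blank line, no final newline.
f=$(mktemp)
printf 'One\nTwo Longest Line\n\nThree' > "$f"

newlines=$(wc -l < "$f")      # counts newline characters only
nonblank=$(grep -c '.' "$f")  # lines with at least one character
all=$(grep -c '^' "$f")       # every line, blank or unterminated

echo "wc -l: $((newlines))  grep -c '.': $nonblank  grep -c '^': $all"
rm -f "$f"
```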

4) Counting lines with awk:

awk's NR variable holds the number of records (lines) read so far. Printing NR in the END block therefore gives the number of the last line, which is the total number of lines in the file.
[~]$ awk 'END {print NR}' sample1.txt
10

5) Counting lines with perl:

Perl has an END block just like awk, and its "$." variable holds the current line number, so printing it in the END block gives the total number of lines.
[~]$ perl -lne 'END {print $.}' sample1.txt
10

Note:

To find the length of the longest line we can use wc with the -L option. As shown below, the longest line in sample1.txt has 24 characters, whereas in sample2.txt it has 16. The total row shows the maximum, not a sum.
[~]$ wc -L sample1.txt sample2.txt
24 sample1.txt
16 sample2.txt
24 total
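
wc -L reports only the length. To also see which line is the longest, a small awk sketch may help. The demo file is made up for illustration, and awk's length() counts characters (or bytes, depending on the awk and the locale) rather than display width, so on lines with tabs or wide characters the number can differ from wc -L.

```shell
# Hypothetical demo file resembling sample1.txt.
f=$(mktemp)
printf '%s\n' One Two Three four five 'six This is longest line' seven > "$f"

# Remember the longest line seen so far; print its length and text at the end.
longest=$(awk 'length($0) > max { max = length($0); line = $0 }
               END { print max ": " line }' "$f")
echo "$longest"
rm -f "$f"
```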

Here are a couple of hacks for counting lines.

Trick 01:
cat -n sample1.txt | tail -n 1 | cut -f1

Explanation: "cat -n sample1.txt" numbers every line of the file. Piping into "tail -n 1" keeps only the last row, which holds the last line number and the last line's content. "cut -f1" then extracts the first tab-separated field of that row (the line number), which is the total count of lines.

Trick 02:

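Use a while loop to read the file line by line and increment a counter.

#!/bin/bash

count=0
# IFS= and -r make read take each line verbatim (no trimming, no backslash escapes)
while IFS= read -r line
do
  count=$((count + 1))
done < sample1.txt

echo "$count"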
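
One caveat with a read loop is that read returns a failure status on a last line that has no trailing newline, so that line is not counted. A possible variation, sketched here with a made-up three-line file, adds a test on the variable so the unterminated line is still counted.

```shell
# Hypothetical demo file with no newline after the last line.
f=$(mktemp)
printf 'One\nTwo\nThree' > "$f"

count=0
# read fails on the unterminated last line but still fills $line,
# so the extra test keeps that line from being skipped.
while IFS= read -r line || [ -n "$line" ]; do
  count=$((count + 1))
done < "$f"

echo "$count"
rm -f "$f"
```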



***End***

Saturday, 7 October 2017

Split command in Linux/Unix

The split command is very useful when you are handling large files. Suppose you have a CSV file with millions of records that takes too long to open. In that case we can split the file into smaller pieces that open easily in any GUI tool.

By default each output file holds 1000 lines and the default PREFIX is "x". We can, however, split by a number of lines or bytes of our choosing and change the prefix as well. In this article I will show you how to use the split command with examples.

Let us consider a file testfile.csv with 1342 records.
[~]$ cat testfile.csv | wc -l
1342

1) Simple split example:

As you can see below, split divides testfile.csv into 2 pieces with the default prefix x. The file has 1342 records in total, so the first piece, xaa, gets the default 1000 lines and the second piece, xab, gets the remaining 342.
[~]$ split testfile.csv

[~]$ ls
testfile.csv  xaa  xab

[~]$ cat xaa | wc -l
1000

[~]$ cat xab | wc -l
342

2) Split file with specific number of lines:

We can use the -l option to set the number of lines in each output file. To split the file into pieces of 500 records each, use the following command.
[~]$ split -l 500 testfile.csv

[~]$ ls
testfile.csv  xaa  xab  xac

[~]$ cat xaa | wc -l
500

[~]$ cat xab | wc -l
500

[~]$ cat xac | wc -l
342
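
After splitting, it can be reassuring to check that the pieces put back together match the original. The sketch below assumes a stand-in file generated with seq (not the author's testfile.csv) and works in a temporary directory.

```shell
work=$(mktemp -d)

# Hypothetical stand-in for testfile.csv: 1342 numbered records.
seq 1 1342 > "$work/testfile.csv"

# The prefix may include a directory; this creates xaa, xab and xac in $work.
split -l 500 "$work/testfile.csv" "$work/x"

pieces=$(ls "$work"/x?? | wc -l)

# The shell expands x?? in sorted order, so cat restores the original order.
if cat "$work"/x?? | cmp -s - "$work/testfile.csv"; then
  intact=yes
else
  intact=no
fi

echo "pieces: $((pieces)), intact: $intact"
rm -rf "$work"
```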

3) Split file with a specific prefix:

If we want our own prefix, for example "NEW", in the split file names, we pass it as the last argument.
[~]$ split -l 500 testfile.csv NEW

[~]$ ls
NEWaa  NEWab  NEWac  testfile.csv

4) Split file with numeric suffix:

We can use numeric suffixes such as 00, 01, 02... instead of the default aa, ab, ac... with the -d option, as follows.
[~]$ split -l 50 -d testfile.csv NEW

[~]$ ls
NEW00  NEW02  NEW04  NEW06  NEW08  NEW10  NEW12  NEW14  NEW16  NEW18  NEW20  NEW22  NEW24  NEW26
NEW01  NEW03  NEW05  NEW07  NEW09  NEW11  NEW13  NEW15  NEW17  NEW19  NEW21  NEW23  NEW25  testfile.csv

By default the numeric suffix has 2 digits, so you need more digits if the split produces more than 100 files. Otherwise you get the following "suffixes exhausted" message and lose every piece that would come after NEW99.
[~]$ split -l 10 -d testfile.csv NEW
split: output file suffixes exhausted

[~]$ ls
NEW00  NEW05  NEW10  NEW15  NEW20  NEW25  NEW30  NEW35  NEW40  NEW45  NEW50  NEW55  NEW60  NEW65  NEW70  NEW75  NEW80  NEW85  NEW90  NEW95  testfile.csv
NEW01  NEW06  NEW11  NEW16  NEW21  NEW26  NEW31  NEW36  NEW41  NEW46  NEW51  NEW56  NEW61  NEW66  NEW71  NEW76  NEW81  NEW86  NEW91  NEW96
NEW02  NEW07  NEW12  NEW17  NEW22  NEW27  NEW32  NEW37  NEW42  NEW47  NEW52  NEW57  NEW62  NEW67  NEW72  NEW77  NEW82  NEW87  NEW92  NEW97
NEW03  NEW08  NEW13  NEW18  NEW23  NEW28  NEW33  NEW38  NEW43  NEW48  NEW53  NEW58  NEW63  NEW68  NEW73  NEW78  NEW83  NEW88  NEW93  NEW98
NEW04  NEW09  NEW14  NEW19  NEW24  NEW29  NEW34  NEW39  NEW44  NEW49  NEW54  NEW59  NEW64  NEW69  NEW74  NEW79  NEW84  NEW89  NEW94  NEW99

To overcome this, increase the number of suffix digits with the -a option, as follows.
[~]$ split -l 10 -a 3 -d testfile.csv NEW

[~]$ ls
NEW000  NEW007  NEW014  NEW021 .........  NEW099  NEW100  NEW101  .........  NEW134
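
The suffix length can be worked out before running split. A rough sketch of the arithmetic, using the 1342 records and the -l 10 value from the example (the variable names are made up for illustration):

```shell
lines=1342      # records in the file, as in the example
per_file=10     # value passed to -l

# Ceiling division gives the number of output files.
files=$(( (lines + per_file - 1) / per_file ))

# Numeric suffixes start at 0, so the largest one is files - 1; count its digits.
last=$(( files - 1 ))
digits=${#last}

echo "files: $files, suffix digits needed: $digits"
```

The resulting digit count is the value to pass to -a.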

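5) Split file by size in bytes:

We can use the -b option with the desired size of each piece. Note that the two forms below are not the same: -b4000 gives pieces of 4000 bytes, while the k suffix means 1024, so -b4k gives pieces of 4096 bytes (use -b4KB for 4000). The listing below is from the -b4k run.
[~]$ split -b4000 testfile.csv
              (or)
[~]$ split -b4k testfile.csv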

[~]$ ls -ltr x*
-rw-rw-r-- 1 mukesh mukesh 3888 Oct  7 21:14 xae
-rw-rw-r-- 1 mukesh mukesh 4096 Oct  7 21:14 xad
-rw-rw-r-- 1 mukesh mukesh 4096 Oct  7 21:14 xac
-rw-rw-r-- 1 mukesh mukesh 4096 Oct  7 21:14 xab
-rw-rw-r-- 1 mukesh mukesh 4096 Oct  7 21:14 xaa
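
To see how the size suffix changes the result, here is a small sketch assuming GNU split. The input is a made-up 20272-byte file (the combined size of the five pieces listed above), and the dec_ and bin_ prefixes are hypothetical names chosen for the demo.

```shell
work=$(mktemp -d)

# Hypothetical 20272-byte input file.
head -c 20272 /dev/zero > "$work/data.bin"

split -b 4000 "$work/data.bin" "$work/dec_"   # 4000-byte pieces
split -b 4k   "$work/data.bin" "$work/bin_"   # 4096-byte pieces (k = 1024)

dec_first=$(wc -c < "$work/dec_aa")
bin_first=$(wc -c < "$work/bin_aa")
dec_count=$(ls "$work"/dec_* | wc -l)
bin_count=$(ls "$work"/bin_* | wc -l)

echo "-b 4000: $((dec_count)) pieces, first is $((dec_first)) bytes"
echo "-b 4k:   $((bin_count)) pieces, first is $((bin_first)) bytes"
rm -rf "$work"
```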

6) Split file into 2 files of equal size:

We can use the -n option in place of -l to get a specific number of output files of (nearly) equal size in bytes.
[~]$ split -n 2 -d testfile.csv NEW

[~]$ ls
NEW00  NEW01  testfile.csv

[~]$ cat NEW00 | wc -l
670

[~]$ cat NEW01 | wc -l
672

[~]$ cat testfile.csv | wc -l
1342

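In the above example you might expect 671 records in each of NEW00 and NEW01, but -n 2 divides the file by bytes, not by lines. The two pieces have the same size (to within a byte), so the line counts differ whenever the records differ in length, and the record at the cut point can be broken across the two files.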
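
With GNU split, -n 2 cuts the file into two chunks of equal size in bytes, which is why the line counts need not match. Two possible alternatives are sketched below, assuming GNU split and a stand-in file generated with seq (the BYSIZE and BYLINE prefixes are made up): -n l/2 still balances by size but never cuts a line in half, and -l with a computed value gives equal record counts.

```shell
work=$(mktemp -d)

# Hypothetical stand-in for testfile.csv: 1342 records of varying length.
seq 1 1342 > "$work/testfile.csv"

# l/2 asks for 2 chunks split only at line boundaries (sizes still byte-balanced).
split -n l/2 -d "$work/testfile.csv" "$work/BYSIZE"

# For equal record counts, compute half the lines (rounded up) and use -l.
total=$(wc -l < "$work/testfile.csv")
half=$(( (total + 1) / 2 ))
split -l "$half" -d "$work/testfile.csv" "$work/BYLINE"

a=$(wc -l < "$work/BYLINE00")
b=$(wc -l < "$work/BYLINE01")
echo "by line count: $((a)) and $((b))"
rm -rf "$work"
```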

***End***

What are the Dark Web, Deep Web and Surface Web?

Friends, we all use the internet daily. Did you know that there is a very big section of the internet which you may never use? Today I will tell you about those sections, the Deep Web and the Dark Web.
The internet is divided into 3 sections, or you can say layers: the Surface Web, the Deep Web and the Dark Web.
Most of you already know the Surface Web; if not, I will tell you about it today.


SURFACE WEB

It is the internet which is used most commonly by everyone in the world, including me. That is called the Surface Web; in fact this video which you are watching now is also a part of the Surface Web. The Surface Web is the part of the internet which is accessible anywhere in the world without any special permission, and its content is easy to find with a simple Google search. That means all the links and websites you get in Google search results are part of the Surface Web, because they are publicly available to everyone. For example, entertainment websites, news websites, music websites, torrents, and all the information you have been using till today are part of the Surface Web. But did you know that the Surface Web is often estimated to be only about 5% of the actual internet? By such estimates nearly 95% of the internet is on the Deep Web. If you compare the web to an iceberg, the Surface Web is just the tip seen above the sea, which you have been using till now and will keep using in future. The actual mass below is the Deep Web.

DEEP WEB

Now let’s see what the Deep Web is and what is stored in it. Online storage such as Google Drive and Dropbox, university documents, research data, the information and databases of big companies and banks, government files, or basically any data which you will never get with a simple Google search, is stored in the Deep Web. To access something on the Deep Web you need its specific URL or server address, and you need permission to access that information. It might be a login ID and password or any other type of authentication, but without the address and the permission you will never reach it. These websites, pages and data are not indexed by Google, Yahoo or Bing, so you will never be able to see what is stored in my Google Drive or cloud storage even if you keep searching for it forever. It is accessible only if a particular file or folder has been shared with you. We need the Deep Web because we want to store our data in the cloud without sharing it with everyone, only with the people entitled to it. It is as simple as a company intranet page where a project document is shared only with selected employees, not with all employees, and does not show up in an internet search. The Deep Web is not a hidden internet altogether, though. There is another layer of the internet which is hidden from everyone, and that is called the Dark Web.

DARK WEB

This is just information we are providing about the Dark Web. I request, or rather warn, you to stay away from it, because much of what goes on there is illegal. The Dark Web is the part of the internet where people do things like drug trading, drug smuggling, arms trading, arms smuggling and other illegal activities which we don’t want to discuss. We just want to inform you that the Dark Web is a black spot of the internet, and it is never shown in any simple search results. If you want to use it, you need a special browser called Tor Browser, which connects through the Tor (The Onion Router) network. Some people additionally route Tor Browser through a VPN, but I remind you again that the activities listed above are illegal. The black markets sit on the Dark Web because Tor bounces your traffic through different nodes all over the world, which makes it very hard to trace a user back to their real address. Onion routing was originally developed for the US Navy, but nowadays the Tor network is spread all over the world, and many agencies are working to take down illegal websites on the Dark Web. Even after all these corrective and preventive efforts, the Dark Web still exists, and a lot of illegal activity goes on there that is reachable only through Tor Browser.


**** END ****