There is a lot of information available about files under Linux, and tools like ls only paint a small part of the picture. The stat command can be used to obtain all kinds of metadata about file-like objects including directories, and is very useful when scripting.
In its simplest form, stat will produce output like the following:
$ stat .bashrc
File: .bashrc
Size: 4910 Blocks: 16 IO Block: 4096 regular file
Device: 801h/2049d Inode: 9571599 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1016/ crb) Gid: ( 1016/ crb)
Access: 2019-01-29 11:38:03.367415065 +0000
Modify: 2018-01-22 12:39:32.643265219 +0000
Change: 2018-02-15 17:56:48.220205052 +0000
Birth: -
This shows a huge amount of metadata for the .bashrc file in my home directory, which I use to configure my bash shell. This tells me:
- File: .bashrc tells us that the file name is .bashrc, which is exactly what we passed to stat on the command line. It is unlikely ever to differ from what you asked for, except when inspecting a symbolic link, in which case the field also includes the destination of the link.
- Size: 4910 is the size of the file in bytes. Note that directories usually also have a size, which reflects the space they use to store the names of the entries within them and the inode numbers those names refer to.
- Blocks: 16 is a field in a slightly archaic format. It records how much space on the filesystem this file takes up, in units of 512 bytes. In this example my .bashrc file takes up 16 × 512 = 8,192 bytes (exactly 8K) of my filesystem partition, because my filesystem stores data in 4K blocks and a 4,910-byte file needs two of them. Note that this figure is not always accurate due to file system implementation details.
- IO Block: 4096 is a hint to applications that the most efficient access through the filesystem is performed in blocks of 4096 bytes. Larger requests will probably be split internally into requests for 4K at a time, and smaller requests may end up reading or writing a full 4K by the time they reach the underlying disk device.
- regular file tells us that this is a plain old file, as opposed to a directory, symbolic link, or more obscure file-like objects such as sockets (for example /run/dbus/system_bus_socket), block devices (e.g. /dev/sda), character devices (e.g. /dev/null), or “named pipes” (FIFOs, e.g. /run/systemd/initctl/fifo).
- Device: 801h/2049d is the number of the device that holds the file system, shown first in hexadecimal and then in decimal. It usually corresponds to a device file in /dev or an entry in /sys/dev/block when the filesystem has an underlying device. See below for a discussion of the Device field.
- Inode: 9571599 is a number that refers to the information held about this particular file: where it is stored and what its attributes are. The Device and Inode fields together should always uniquely identify a file or directory. One file may be hard-linked and so appear under more than one name in the same file system, in which case all of its names report the same Device and Inode numbers.
- Links: 1 suggests that this file has no hard links elsewhere in the filesystem. The number rises for each hard link present in the file system, but not for symbolic links. When a file is deleted the underlying system call used is unlink(), which just removes the link between the name and the inode: if the link count reaches zero, and no process still has the file open, then the inode itself is removed, freeing up space in the file system. For directories, the link count is traditionally two plus the number of immediate subdirectories: one link for the directory's name in its parent, one for its own . entry, and one for the .. entry in each subdirectory. Some file systems do not follow this convention. Look out for a future Technical Tip about inodes, hard links and multiply linked files.
- Access: (0644/-rw-r--r--) shows the traditional Unix file system permissions, displayed first in octal and then in a “human readable” symbolic format.
- Uid: ( 1016/ crb) denotes the file owner. This is always stored only in numeric form in the filesystem, but stat translates that to a user name for us to make it more readable.
- Gid: ( 1016/ crb) similarly denotes the group owner of the file. Like the Uid, this is stored only as a number in the filesystem but is translated for us.
- Access: 2019-01-29 11:38:03.367415065 +0000 is traditionally the last time the file was accessed for reading. Using stat should not change this value, but viewing the contents of the file in an editor—or in this case starting my shell which reads this file—should update this time stamp. In modern Linux systems the access time, or atime, is not always changed every time a file is accessed, but the rules governing exactly when it changes are complicated and will be the subject of a future Technical Tip.
- Modify: 2018-01-22 12:39:32.643265219 +0000 is usually the time that the file or directory contents were last changed. It does not necessarily reflect changes to properties such as permissions or ownership information.
- Change: 2018-02-15 17:56:48.220205052 +0000 is the time that properties of the file were last changed, including ownership, permissions, or file contents. Note that the atime and mtime properties can be changed to arbitrary values by programs such as touch, but the ctime is deliberately impossible to set to an arbitrary value without resorting to drastic measures such as changing the system clock or manipulating internal filesystem structures.
- Birth: - is designed to track the time at which a file was first created, but this field is not available through the traditional stat() system call on Linux, which is why it shows as - here. Many file system implementations on Linux do track file creation time, and newer kernels (4.11 and later) expose it through the statx() system call, so more recent versions of stat can display it where the file system supports it.
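The behaviour of the Inode and Links fields described above can be observed directly. The following is a minimal sketch, assuming GNU stat (for the -c option) and a temporary directory on a file system that supports hard links. The file names original and alias are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d)
echo hello > "$dir/original"

# A freshly created regular file has exactly one name, so one link.
links_before=$(stat -c '%h' "$dir/original")

# Give the same inode a second name.
ln "$dir/original" "$dir/alias"
links_after=$(stat -c '%h' "$dir/original")

# Both names report the same device and inode numbers.
id_original=$(stat -c '%d:%i' "$dir/original")
id_alias=$(stat -c '%d:%i' "$dir/alias")

# Removing one name only lowers the link count; the data survives
# under the remaining name.
rm "$dir/alias"
links_final=$(stat -c '%h' "$dir/original")
contents=$(cat "$dir/original")

echo "links: $links_before -> $links_after -> $links_final"
echo "device:inode via original: $id_original"
echo "device:inode via alias:    $id_alias"

rm -r "$dir"
```

The link count should go from 1 to 2 and back to 1, while the device and inode pair is identical for both names.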
Internally, the stat command is a fairly thin wrapper around the stat() system call. You should be able to replicate its functionality in your preferred scripting or programming language quite easily without having to call the stat command itself in most cases.
Scripting with stat
stat can be used to extract individual fields from files in a specific format, which is very handy for scripting. For example, to extract the fields that ls -l shows by default in a more scripting-friendly format, one could use the following:
$ stat -c '%#a %h %U %G %s %Y' .bashrc
0644 1 crb crb 4910 1516624772
This gives us the permissions in octal (with the customary leading zero), the number of links, user and group ownership, size in bytes, and the modification time (as a Unix time stamp, measured in seconds since midnight on the 1st of January 1970 UTC). The fields are separated by spaces, which makes them easy to split in a shell script. The stat(1) man page lists all the possible fields and options.
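As a rough sketch of how such output might be consumed, the script below reads the six fields into shell variables. It assumes GNU stat and a POSIX shell; the temporary file and the variable names (mode, links, owner, group, size, mtime) are purely illustrative.

```shell
# Create a small file with known permissions and contents to inspect.
f=$(mktemp)
printf 'example\n' > "$f"
chmod 0640 "$f"

# read splits its input on whitespace, one field per variable.
read -r mode links owner group size mtime <<EOF
$(stat -c '%#a %h %U %G %s %Y' "$f")
EOF

# The modification time is a plain integer, so shell arithmetic works on it.
age=$(( $(date +%s) - $mtime ))

echo "mode=$mode links=$links owner=$owner group=$group"
echo "size=$size bytes, modified $age seconds ago"

rm "$f"
```

Splitting on whitespace is only safe here because none of the chosen fields can contain a space. A format that includes the file name (%n) would need a different separator.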
The Device field
When a file system is mounted from a block device, such as /dev/sda1 (device 801h in the example above), the Device field corresponds to the device the file system was mounted from. Not all file systems are backed by a block device: /sys and /proc are examples. In that case the Device field has no corresponding device, but it is still consistent among all files in the filesystem.
The device number combines the major and minor numbers of the underlying device. For small values it fits in two bytes (four hexadecimal digits), with the major number in the high byte and the minor number in the low byte. Linux also allows larger major and minor numbers, which use a wider encoding. The simple example above (801h or 0x0801) translates to major 8, minor 1. Device fe0dh works the same way: 0xfe is the major number (254 in decimal) and 0x0d is the minor number (13 in decimal). This translates to the following device on my particular system:
$ ls -l /dev/dm-13
brw-rw---- 1 root disk 254, 13 Jan 29 11:38 /dev/dm-13
$ ls -l /sys/dev/block/254:13
lrwxrwxrwx 1 root root 0 Jan 29 17:49 /sys/dev/block/254:13 -> ../../devices/virtual/block/dm-13
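The arithmetic above can be sketched in shell. The decode_dev function below is a hypothetical helper, not part of stat. It assumes a small device number where the major fits in the high byte and the minor in the low byte, so it will give wrong answers for devices that need the wider encoding.

```shell
# Split a small device number, given in hexadecimal as stat prints it
# (without the trailing "h"), into "major:minor" in decimal.
decode_dev() {
    dev=$(( 0x$1 ))
    echo "$(( $dev >> 8 )):$(( $dev & 0xff ))"
}

decode_dev 801     # the .bashrc example: prints 8:1
decode_dev fe0d    # the device-mapper example: prints 254:13

# GNU stat's %D format prints a file's device number in hexadecimal,
# so the two can be combined for a real file (assuming small numbers).
decode_dev "$(stat -c '%D' /)"
```

Where the file system has an underlying block device, the resulting major:minor pair is the name to look for under /sys/dev/block.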
More about stat
In this Technical Tip we’ve only scratched the surface of the stat command. We’ll be covering more of it in a future Tip.