[comp.unix.questions] Listing files bet. two specified dates

manoj@hpldsla.sid.hp.com (Manoj Joshi) (10/23/90)

When I do an 'ls -l' or 'll', the eighth column gives either the year
or the time, viz., 


drwxrwxrwx   8 219      users       1024  Mar 14  1990 cp
drwxrwxrwx   3 219      users       1024  Apr 10  1990 fnguyen
drwxrwxrwx   5 219      users       1024  Apr 19  1990 lp
drwxrwxrwx   2 manoj    users       1024  Oct 19 16:26 editors
drwxrwxrwx   2 manoj    users       1024  Oct 22 09:36 1000


I want to write an awk script that reads YYMMDD for a file. Now,
how do I know if field 8 is the time and not the year? Will I be
forced to write a C program that reads the inode information?

Alternatively, has anyone written a utility that will list out all
files that have their modification times between two specified times?
For example, all files between Oct 1 1990 and Oct 10 1990.

Thanks.
Manoj Joshi.

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (10/24/90)

In article <9220005@hpldsla.sid.hp.com> manoj@hpldsla.sid.hp.com (Manoj Joshi) writes:

| Alternatively, has anyone written a utility that will list out all
| files that have their modification times between two specified times?
| For example, all files between Oct 1 1990 and Oct 10 1990.

  The 'le' utility, posted to the net some time ago, has an option -f
which causes dates to be output YYYY-MM-DD HH:MM for sorting. It was
intended to help with sorting, but would make your task a bit easier,
too.

  You can use awk, get the field using substr rather than the field
number, and use the 'split' function to determine if the value is a
time, and if so put the year in place of the time. This is an uglier
solution, but it uses all portable tools.
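
A minimal sketch of the awk approach, under some loud assumptions: the
listing has the nine-column layout shown in the original question (month,
day, and year-or-time in fields 6, 7, and 8, name in field 9), file names
contain no spaces, and an entry that shows a time belongs to the year
passed as the argument. That last assumption can be wrong, since ls shows
a time for anything modified in roughly the last six months, so a December
file listed in January would get the wrong year. The name ls_dates and the
output layout are invented for this sketch, and it uses field numbers
rather than substr for brevity.

```shell
#!/bin/sh
# ls_dates: read 'ls -l' output on stdin, print "YYMMDD name" per entry.
# $1 is the four-digit year to assume when field 8 holds a time.
# Hypothetical helper; assumes fields 6-8 are month, day, year-or-time.
ls_dates() {
    awk -v thisyear="$1" '
    BEGIN {
        # Map month abbreviations to month numbers.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
        for (i = 1; i <= n; i++) month[m[i]] = i
    }
    NF >= 9 && ($6 in month) {
        # split returns 2 for a time such as 16:26 and 1 for a year.
        year = (split($8, hm, ":") == 2) ? thisyear : $8
        printf "%02d%02d%02d %s\n", year % 100, month[$6], $7, $9
    }'
}
```

It could be run as: ls -l | ls_dates "$(date +%Y)"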
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me

rice@dg-rtp.dg.com (Brian Rice) (10/25/90)

In article <2162@sixhub.UUCP>, davidsen@sixhub.UUCP (Wm E. Davidsen Jr) writes:
|> In article <9220005@hpldsla.sid.hp.com> 
|> manoj@hpldsla.sid.hp.com (Manoj Joshi) writes:
|> | Alternatively, has anyone written a utility that will list out all
|> | files that have their modification times between two specified times?
|> | For example, all files between Oct 1 1990 and Oct 10 1990.

The hardest part of this problem is the date arithmetic.  If the 
problem were to print all files modified between, say, 10 and 40 
days ago, the solution would be simple (I've tested this to some 
extent, but it's still more or less off the top of my head):

#!/bin/sh
# mbetween path bound_1 bound_2
# Argument 1 is taken to be the root of the directory tree to 
# search.  Argument 2 is one bound (in days), and argument 
# 3 is the other bound (in days).

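# Files newer than the smaller bound turn up in both lists; files
# whose age lies between the two bounds turn up only once.
find "$1" -mtime -"$2" -print > /tmp/mbetween.$$
find "$1" -mtime -"$3" -print >> /tmp/mbetween.$$

# uniq -u keeps only the names that appear exactly once.
sort /tmp/mbetween.$$ > /tmp/mbetween2.$$
uniq -u /tmp/mbetween2.$$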

rm -f /tmp/mbetween.$$ /tmp/mbetween2.$$
exit 0

If you had a date arithmetic package (didn't one just get posted
somewhere recently?) which supplied a Julian date function, you 
could provide a front end to mbetween (this is pseudocode):

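#!/bin/sh
# better_mbetween path date_1 date_2
# jdate is hypothetical: it prints the Julian day number of a date.

now=`date`
today=`jdate "$now"`
day1=`jdate "$2"`
day2=`jdate "$3"`
mbetween "$1" `expr $today - $day1` `expr $today - $day2`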
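
For what it's worth, the day-number part can be done in plain shell
arithmetic. The sketch below is not the posted package: daynum is a name
made up here, it assumes its argument is a date written as YYYYMMDD in
the Gregorian calendar (year 1 or later), and it counts days from
1 January 1970 rather than giving a true Julian day number. Only the
difference between two results matters for this purpose.

```shell
#!/bin/sh
# daynum YYYYMMDD: print the number of days from 1970-01-01 to that date.
# Hypothetical helper; the body runs in a subshell so that its variables
# do not leak into the caller.
daynum() (
    y=$(( $1 / 10000 ))
    m=$(( $1 / 100 % 100 ))
    d=$(( $1 % 100 ))
    # Count the year from March, so the leap day falls at the very end.
    if [ "$m" -le 2 ]; then
        y=$(( y - 1 )); m=$(( m + 9 ))
    else
        m=$(( m - 3 ))
    fi
    era=$(( y / 400 ))                         # 400-year cycles
    yoe=$(( y % 400 ))                         # year within the cycle
    doy=$(( (153 * m + 2) / 5 + d - 1 ))       # day within the March-based year
    doe=$(( yoe * 365 + yoe / 4 - yoe / 100 + doy ))
    echo $(( era * 146097 + doe - 719468 ))
)
```

The age in days of a date would then be something like
expr "$(daynum "$(date +%Y%m%d)")" - "$(daynum 19901001)".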
--
Brian Rice   rice@dg-rtp.dg.com   +1 919 248-6328
DG/UX Product Assurance Engineering
Data General Corp., Research Triangle Park, N.C.

manoj@hpldsla.sid.hp.com (Manoj Joshi) (10/25/90)

I got a lot of interesting and very helpful mail from people on the
net. To summarise, the general opinion was that `find` comes closest
to solving this problem. I found that the -mtime and -ctime options
of find are not too accurate (they count in whole days). Besides, I
would have had to write a shell script anyway to feed the dates to
find. So I decided instead to write a shell script which scans through
the files in a directory and uses a C program to check whether each
file falls within the time interval. The performance was great: it's
lightning fast on my 12 MIPS workstation. Anyway, I have included both
here for people's curiosity. I could make the script and the program
more efficient, but for now they are only experimental.

With this script I can archive files between two dates, and it is
accurate about that. It would not be too hard to implement exact times.
In the long run, I would probably put everything, including listing
the files, in the same program.


/*********************************************************************/
/*  EXAMPLE LISTING */
/*********************************************************************/

#!/bin/ksh
# Declaration of variables

hostName=$1
instName=$2
fromDate=$3
toDate=$4
selectList=""
searchDir=$DATA_DIR/$hostName/hpchem/$instName/data
currentDir=`pwd`
cd $searchDir

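# ls -R prints bare names under "dir:" headers, so files in
# subdirectories would not be found; find prints usable paths.
list=`find . -type f -print`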
for currentFile in $list
do
   $currentDir/selectFile $searchDir/$currentFile $fromDate $toDate
   result=$?
   if [ "$result" -eq 0 ]
   then
      selectList="$selectList $currentFile"
   fi
done
echo tar -cvf /dev/update.src $selectList
cd $currentDir


/*********************************************************************/
/*  EXAMPLE LISTING */
/*********************************************************************/


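#include <stdio.h>
#include <stdlib.h>     /* atoi, malloc, exit */
#include <string.h>     /* memset */
#include <time.h>       /* struct tm, mktime, asctime, localtime */
#include <sys/types.h>
#include <sys/stat.h>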

int selectFile(file, fromDate, toDate)
   char* file; 
   char* fromDate; 
   char* toDate;
   { /* selectFile */
   struct stat *buf;
   struct tm *timeptr1;
   struct tm *timeptr2;
   time_t ftime;
   time_t ttime;

   int fDate = (int)atoi(fromDate);
   int tDate = (int)atoi(toDate);

   buf = (struct stat *)malloc(sizeof(struct stat));
   memset(buf, '\0', sizeof(struct stat));

   if (stat(file, buf))
      {
      perror("stat");
      return(1);
      };

   timeptr1 = (struct tm *)malloc(sizeof(struct tm));
   memset(timeptr1, '\0', sizeof(struct tm));
   timeptr1->tm_sec = 0;
   timeptr1->tm_min = 0;
   timeptr1->tm_hour = 0;
   timeptr1->tm_mday = fDate%100;
   timeptr1->tm_mon = (fDate/100)%100-1;
   timeptr1->tm_year = (fDate/10000)%100;  /* years since 1900: YY is 19YY */
   timeptr1->tm_wday = 0;
   timeptr1->tm_yday = 0;
   timeptr1->tm_isdst = -1;
    
   timeptr2 = (struct tm *)malloc(sizeof(struct tm));
   memset(timeptr2, '\0', sizeof(struct tm));
   timeptr2->tm_sec = 59;
   timeptr2->tm_min = 59;
   timeptr2->tm_hour = 23;
   timeptr2->tm_mday = tDate%100;
   timeptr2->tm_mon = (tDate/100)%100-1;
   timeptr2->tm_year = (tDate/10000)%100;
   timeptr2->tm_wday = 0;
   timeptr2->tm_yday = 0;
   timeptr2->tm_isdst = -1;

   if ((ftime = mktime(timeptr1)) == (time_t)-1)
      {
      perror("mktime");
      return(1);
      };

   if ((ttime = mktime(timeptr2)) == (time_t)-1)
      {
      perror("mktime");
      return(1);
      };

#ifdef TEST
   printf("From Date : %s\n", asctime(timeptr1));
   printf("To Date : %s\n", asctime(timeptr2));
   printf("File Date : %s\n", asctime(localtime(&(buf->st_mtime))));
#endif

   if ((buf->st_mtime >= ftime) && (buf->st_mtime <= ttime))
      return(0);
   else return(1);
    
   } /* selectFile */

int main(argc, argv)
   int argc;
   char** argv;
   { /* main */
   if (argc != 4)
      {
      fprintf(stderr, 
      "Usage: selectFile <file> <fromDate(YYMMDD)> <toDate(YYMMDD)>\n");
      exit(1);
      };
    
   if (!(selectFile(argv[1], argv[2], argv[3])))
      {
#ifdef TEST
      printf("%s selected\n", argv[1]);
#endif
      return(0);
      }
   else return(1);

   } /* main */

merlyn@iwarp.intel.com (Randal Schwartz) (10/25/90)

In article <1990Oct24.181321.23205@dg-rtp.dg.com>, rice@dg-rtp (Brian Rice) writes:
| #!/bin/sh
| # mbetween path bound_1 bound_2
| # Argument 1 is taken to be the root of the directory tree to 
| # search.  Argument 2 is one bound (in days), and argument 
| # 3 is the other bound (in days).
| 
| find $1 -mtime -$2 -print > /tmp/mbetween.$$
| find $1 -mtime -$3 -print >> /tmp/mbetween.$$
| 
| sort /tmp/mbetween.$$ > /tmp/mbetween2.$$
| uniq -u /tmp/mbetween2.$$
| 
| rm -f /tmp/mbetween.$$ /tmp/mbetween2.$$
| exit 0

Long way.  Try:

find $1 -mtime -$2 -mtime +$3 -print

with $2 and $3 being in the right order.  Much faster, and not prone
to phase errors.

I'd give you a Perl solution, but there's no point unless you want
finer granularity than one day.

Just another UNIX hacker,
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel put the 'backward' in 'backward compatible'..."=========/

alex@am.sublink.org (Alex Martelli) (10/27/90)

In <1990Oct24.181321.23205@dg-rtp.dg.com> rice@dg-rtp.dg.com (Brian Rice) writes:
	...
>find $1 -mtime -$2 -print > /tmp/mbetween.$$
>find $1 -mtime -$3 -print >> /tmp/mbetween.$$
>sort /tmp/mbetween.$$ > /tmp/mbetween2.$$
>uniq -u /tmp/mbetween2.$$
>rm -f /tmp/mbetween.$$ /tmp/mbetween2.$$

I don't see how this can work as is (to follow this approach, you'd have
to place the sorted outputs of the two find's into two separate files,
then use comm on these files, etc), and anyway this is simpler:

	find $1 -mtime +$2 -mtime -$3 -print

since '+$2' means 'MORE than $2 days ago' just as '-$3' means 'LESS
than $3 days ago'.  Anyway, I agree that the date arithmetic's harder.
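
The date arithmetic can also be sidestepped by comparing against two
reference files. A rough sketch, assuming a find that supports -newer
and a touch that accepts -t with a [[CC]YY]MMDDhhmm[.ss] timestamp (both
are in POSIX, but older systems may lack them). The function name between
and the temporary file names are invented here. Note that -newer is a
strict comparison, so a file stamped exactly at the lower bound is left
out and one stamped exactly at the upper bound is kept.

```shell
#!/bin/sh
# between dir from to
# List regular files under dir modified after 'from' and no later than
# 'to'. Both bounds are touch -t timestamps, e.g. 199010010000.
# Hypothetical helper; the reference files live in $TMPDIR or /tmp.
between() {
    lo=${TMPDIR:-/tmp}/between.lo.$$
    hi=${TMPDIR:-/tmp}/between.hi.$$
    # Stamp two empty reference files with the bounds.
    touch -t "$2" "$lo" && touch -t "$3" "$hi" || return 1
    # Newer than the lower bound, but not newer than the upper bound.
    find "$1" -type f -newer "$lo" ! -newer "$hi" -print
    rm -f "$lo" "$hi"
}
```

For the dates in the original question this would be called as
between . 199010010000 199010102359.59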
-- 

Alex Martelli - (home snailmail:) v. Barontini 27, 40138 Bologna, ITALIA
Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434;