Merge Data From 100s of excel files in Bulk Using Python

If you already work or intend to work with excels, the following tutorial would be very useful to you.

Let’s say you prepare expense report for your company, and you log in daily expenses in the sheet. But what if you want to calculate all the expenses for a particular month, or even for a year? You would need to pull up each file, do copy-paste, and it may take hours to do the Job. Now, you can cut down this time to a minute of two. by using Python Modules, and do it all at once. You can even reuse the code and get this menial job done in seconds.

Even if you don’t know anything about Python, or if you are a beginner, don’t worry. I have broken this down in easy steps so that you can understand. If you are an advanced python user, directly go to the code section (Third Step).

Remember, the first time when you set this up, you may find this intimidating, but keep going. This will help you do this tedious task quite easily for the rest of your life. You can use it without any coding experience. I will also record a video for your convenience.

First Step – Install Python
1. Go to Python’s Official Website, and download Python. I recommend downloading version 3.5 or higher.

2. Once you download it, install python, and environment variable for python. I have covered it in this python pip post.

3. Once you set up the variable. Go to your command prompt (windows+r) key, and install the below modules by typing in these commands
python –m pip install openpyxl   (For .xlsx files)
python –m pip install xlrd  (For .xls files)

4. Since you now have Python installed, open IDLE (It is a python shell, where we would be doing all of our task). IDLE looks like this

Second Step – Go to the folder where you have those files

  1. Open the folder where the excel files are located, and note down the directory.
  2. In your python shell (which looks like the picture above), type the following commands after >>> (Don’t leave any spaces)
    import openpyxl
    import xlrd
    import os
    import globe
  3. Now, go to the directory, you do that in python using the below command
    os.chdir(“directory name”). Say if my directory name is C:\Users\Deepak\ExpenseSheet, the code will be os.chdir(‘C:\\Users\\Deepak\\ExpenseSheet’) {Notice the two ‘\’ instead of one. For starters, windows use one \ as a separator and python uses two, so always put two backslashes)

Third Step – Code. 

Now that you are in the directory, you are all set to merge the excel files. But, you may have either .xls or .xlsx files. So depending on the files that you are looking to merge, use the respective codes.

a) If you want to merge .xlsx files

list = glob.glob(‘*.xlsx’)
for key in list:
wb = openpyxl.load_workbook(key)
sheet = wb.get_sheet_by_name(“Sheet1”)
for j in range (1,200,1):
print (j, sheet.cell(row =j, column =2).value)

Note {Column = 2 means 2nd column, as openpyxl in the code starts counting from 1, so if you store expenses in say column 3, then  change the number to 3}

a) If you want to merge .xls files

list = glob.glob(‘*.xls’)
for key in list:
wb = xlrd.open_workbook(key)
sheet = wb.sheet_by_index(0)
for j in range (1,sheet.nrows,1):
print (j, sheet.cell_value(rowx =j, colx =1))

Note{colx = 1 means 2nd column, as xlrd in the code starts counting from zero, so if you store expenses in say column 3, then  change the number to 2}

When you hit enter you will see all the data along with the names in the python shell, Just copy it, and paste it to a new file – which will have all your data.

I hope this was easy, and as with all my posts, if you find any difficulty in using it, drop me a comment and I would be happy to help you. Python is a great language, and I encourage you to learn it.

Easy way to find repetation of words in Excel

“Most humans use 2000 words in a language”, says an old adage. Although I do not agree with it, I still believe that word analysis is essential for everyone.

Suppose you are learning a language, how great would be it if you knew the keywords that most newspapers use. Or, if you are a technical writer who is tasked to write a highly specialized content that demands a specific vocabulary, in such cases, word analysis can help.

Even if you read newspapers, you are likely to find a few words that you may not pay attention to, but if you did, then you could improve your vocabulary. Or say if you are a content head, and want to check content for plagiarism, you could find the frequency of the words used, and make out if the content is plagiarized.

This article will help you understand how this can be done — in simple steps. All you need is Microsoft Excel installed on your computer.

1. Start your excel program.

2. Copy Any string, or piece of string that you want to perform a repetition check for. (I have posted the image just below step 3)

3. Go to “Developer” option on your Excel and click on Macros (As located in the image below)
4. Once you click there, assign macro a name (like, Find String Macro), and click on ‘create’ which is located on the right-hand side of the dialog box.

5. You will see a dialogue box, which would be blank, paste the code below.

Sub Fable()
Dim BigString As String, I As Long, J As Long, K As Long
         BigString = “”
For Each r In Selection
BigString = BigString & ” ” & r.Value
Next r
BigString = Trim(BigString)
ary = Split(BigString, ” “)
Dim cl As Collection
Set cl = New Collection
For Each a In ary
On Error Resume Next
cl.Add a, CStr(a)
Next a

For I = 1 To cl.Count
v = cl(I)
Cells(I, “B”).Value = v
J = 0
For Each a In ary
If a = v Then J = J + 1
Next a
Cells(I, “C”) = J
Next I
End Sub

6. Now, press Alt+F11 or click Excel icon on the top left (below file menu).

7. Select the cells that you want to run text repetition check for, and click on macros again, and select the macro that you just created (with the name you assigned in step 4).

Say you have data in cells A1 to A4, so, you need to first select all the four cells and then click on macro (So that it excel knows where to run the string repetition check).

8. You are done! now you will see two rows, one with the word, and second with the number of repetition of the word in the excel file.

Now, if you are learning a language like Spanish or English,  pick up the words that you don’t know, and put them in your flashcard or some other repetition list.

High Typing Accuracy By Building Auto Correction Tool

If you type for 6 hours a day, with 95{f5ea2038a4b880f335334dfff88b50aec469abded8ad52b137b4eb824479fac2} accuracy, you lose around 15 days in a year, correcting your typos. That’s why you should invest your time in reading this — as it certainly can help you weed out 14 out of those 15 days, like it has in my case.

It’s been a while since my last post, which gives tips on typing over 10k words per day. But since then, I realized that there are quite a few things that can help us type much more — easily and efficiently. By doing this, I have been able to type 20,000 words per day.

Accuracy: The Perennial Foe of Typing Quickly
There are quite a few things that make us recede our typing speed, one of the most notorious things is — accuracy. Suppose, you type the word believe incorrectly as “believe”, in such cases, you would have to type in three additional keystrokes (Four times arrows keys to shift and a backspace)— until you reach and correct it. But, you always don’t type five additional keystrokes, so let’s take a modest estimate of three keystrokes.

It means every time you make a mistake, you press three keystrokes to correct it. Now assuming your typing accuracy is 95{f5ea2038a4b880f335334dfff88b50aec469abded8ad52b137b4eb824479fac2}, you spend 15{f5ea2038a4b880f335334dfff88b50aec469abded8ad52b137b4eb824479fac2} of your time in correcting mistakes — which is huge, by any yardstick. If you type for 6 hours a day, you would end up losing 1 hour every day on correcting a mistake, which translates spending a little over 15 days on correcting your typos. Now, that isn’t all, I have observed that one tends to lose focus and momentum the minute we try to alter our typing position. Therefore, it consumes additional mental resources to correct things.

Many software tools have autocorrected, why should I bother to build my own?
That is a good question, but do you always type in those environments like MS word etc? Well, this isn’t all, try typing in “better ” (With space) on MS word, let’s see if it corrects it, mine wouldn’t! These software care to a wide range of audience, so they lack the sort of personalization that we, individuals, need.

So, how to build your own autocorrect?

Here’s how you can go about doing it — (step by step on Windows Machine)

1. Download the auto hotkey from https://autohotkey.com/

2. Once you are done, install it.

3. Then,  go to any folder (or Desktop), and right click, select new, and then you would see an option called AutoHotkey Script.

4. Now, select that file and click edit, once you do that, it would open in a notepad.

5. Put all the word mistakes that you make, along with the corrected form. Make sure that you use two colons (::) to separate the words, mistakes on the left-hand side, and corrected words on the right-hand side.  Here’s how it should look like. You can add thousands of mistakes as such.

6. Once you are done entering it, you can save and close it (You edit it whenever you want).

7. Now, click on the file again, and it would ask you if you want to run the script, click yes!

Now, you are covered. Be very careful when you write this script and include common words. Notice the second last word in the picture, “very Good”. Now, good is a valid word and I don’t want my spell check to correct it to good everytime I type it, because I may want to thank god! Even if you only cover your vague mistakes, you would be able to achieve over 99{f5ea2038a4b880f335334dfff88b50aec469abded8ad52b137b4eb824479fac2} accuracy, which would save you months of time in the long run. Similarly, you can build such checks for special characters of other languages like French or Spanish.

Tips To Improve Vocabulary in Various Languages

To learn any language, it is essential to master the vocabulary, as it is not as difficult as it may seem to you — as long as you leverage the technology. You simply can’t ignore vocabulary to master any language. These tips hold good for any target language. You would need a few tools like Evernote, a spreadsheet program, like Excel or LibreOffice or Openoffice. Mastering vocabulary in various languages have a few important components — detecting the right words, and then putting in a good system of practice. Let’s go over them so that it can help you — one by one.
  1. Pick up any of your favorite writing source for your target language and copy it. For instance, if your target is to improve french vocabulary, then go to any good source and copy the content to your excel, and find repetitions of words using excel.
  2. Prepare a blank notebook on Evernote (Or Notepad), to keep a record of difficult words, be sure not to keep their definition, because it is important for your brain to make that neuro connection to absorb the meaning.
  3. Use a good flashcard program like Anki, and space your repetitions, for instance, if you plan to master 10 words every day, revise the words twice a day on the first day, third day and the seventh day. This way, you would imbibe the words deeper into your mind. It never hurts to revise the words once in a while.
  4. Once you learn any word, be sure to write it, it can be in your journal or anywhere else, this part is important as you can only learn words better if you can make strong connections (as I mentioned in point 3), so the more intense and variety of connections you make, the strong your connections get to that word.
  5. Try guessing the meanings of words that are in your list – before looking them in the dictionary. If you think more about the word, you might be able to get a good guess at it, and would certainly retain it for a longer period of time. For instance, if you come across a word “agua” in Spanish, before looking up for its meaning (which is water), try to relate it with a similar word, and once you do, you would be able to retain it for a long period of time.

If you leverage technology well, then mastering language vocabulary will get much easier. So, don’t get overwhelmed. instead, get smart and push hard towards your language learning goals.

Why Choose Portuguese or Italian as your 3rd Language

Language learning is extremely rewarding in the sense that it enables you to learn more languages quickly. So, if you are already proficient in either Spanish or French, it is the time for you to increase your language net, and either target Portuguese or Italian as your third language.

WHY CHOOSE PORTUGUESE OR ITALIAN AS YOUR 3RD LANGUAGE?

There are numerous reasons, first, it is financially more rewarding. For instance, if you are a translator, this would give you another language combination. And consider this — if you know two languages, you can work on two combinations, like Spanish <==> English and vice versa, but it you add one more language, you can work on four combinations, and every new language doubles your language combination.

There is a good demand in industry for both these languages. Moreover, it takes about the same time to learn these languages as it does to learn Spanish and French, so you really won’t have to spend a lot of time which you otherwise would in learning a language like Japanese. Moreover, there are good resources available for these languages.

BUT, WAIT A MINUTE, WHAT IF YOU WANT TO LEARN BOTH?

Well, it is possible to learn two languages at the same time. And, I encourage you to take up both. But then, for various reasons, you might in be interested in picking the one. Well, in such cases, pick up the language that is more similar to the one that you have already learned.

For instance, if you have learned Spanish as your second language, then go for Portuguese as your third language. Similarly, go for Italian if you chose French as your second language.

In my case, I decided to pick up both at the same time — and it is fun to learn it this way. There are various good resources available on both of these, but I highly recommend Michel Thomas’s method for beginners and switch to  other language learning resources. As with any language, it is important to read extensively so that you become familiar with the language.

SOME TIPS BEFORE YOU START

Learning languages is fun, but overdoing it can be distressing. Often it is good to learn a language, give yourself a break, and then switch to something else. But it is important to use that break to think about your language goals. It is quite easy to overlook planning, which is essential to learn anything. For instance, if speak Spanish, then going for Italian will probably take twice as much effort as it would master Portuguese. So before you start, look up the conjugations, and word patterns. If you don’t perform this check, then learning your third language can get difficult and intimidating.

Copy-Paste more than One String at a time in Windows

Often, we wish if we could copy more than one piece of string at a time. It would have just saved a lot of time. But alas! our windows OS generally doesn’t give us this capability. If you think it isn’t important, just think for a minute, what if you saw 5 good websites that you want to open separately in a new tab? You would have to do Copy paste, over and over again, that’s tedious.

Control+c (cmd+c) and control+v (control+v) method allows you to copy one string at a time, and that is quite frustrating when you want to copy a lot of strings on the go. The good news is, there is a way to do it — all at once.

This is a very simple process for which, you would need to download autohotkey from https://autohotkey.com. I have used this software to building an autocorrect tool. If you don’t know how to set it up, then refer to the article on autocorrection.  Or, you can see the procedure below.

Once you are set, let’s begin. It is a very simple process which everyone can easily follow.

As soon as you have your script file open up, paste the below code in the notepad

^+1::
Send ^c
ClipWait
Clip1 := ClipBoardAll
return
!1::
ClipBoard := Clip1
Send ^v
return

^+2::
Send ^c
ClipWait
Clip2 := ClipBoardAll
return
!2::
ClipBoard := Clip2
Send ^v
return

^+3::
Send ^c
ClipWait
Clip3 := ClipBoardAll
return
!3::
ClipBoard := Clip3
Send ^v
return

^+4::
Send ^c
ClipWait
Clip4 := ClipBoardAll
return
!4::
ClipBoard := Clip4
Send ^v
return

^+5::
Send ^c
ClipWait
Clip5 := ClipBoardAll
return
!5::
ClipBoard := Clip5
Send ^v
return

^+6::
Send ^c
ClipWait
Clip6 := ClipBoardAll
return
!6::
ClipBoard := Clip6
Send ^v
return

^+7::
Send ^c
ClipWait
Clip7 := ClipBoardAll
return
!7::
ClipBoard := Clip7
Send ^v
return

^+8::
Send ^c
ClipWait
Clip8 := ClipBoardAll
return
!8::
ClipBoard := Clip8
Send ^v
return

^+9::
Send ^c
ClipWait
Clip9 := ClipBoardAll
return
!9::
ClipBoard := Clip9
Send ^v
return

^+0::
Send ^c
ClipWait
Clip0 := ClipBoardAll
return
!0::
ClipBoard := Clip0
Send ^v
return

In the above code, you can hover select the text and press control-Shift-(respective number) to copy, and alt (respective number to paste).

So, for instance, I want to copy “6789” and “1234”, then first I would select 6789, press control-Shift-1, and select 1234 and press control-shift-2. By doing that, I can paste 1234 when I press, alt-2, and 6789 whenever I press alt-1. This code lets you configure 10 keys which you can use to copy paste more conveniently, right from 0 to 9.

This would save you a lot of time, especially if you copy paste a lot — be it on the browser, or excel etc. If you run into any problem, then do not hesitate to drop in a comment, and I would do my best to help you.