JA1. Learning Journal 1
Statement
Your learning journal entry must be a reflective statement that considers the following questions:
1. Describe what you did
This was the first week of this course, and I was excited to start learning about information retrieval. I started by exploring the course syllabus and participating in the general forum discussion. Later, I began this week’s reading, tried the quiz, continued reading, and then completed the discussion and journal entries.
2. Describe your reactions to what you did
Honestly, I was surprised by the course content. I thought it would be just another databases course covering best practices for storing and retrieving data, but it goes well beyond that: it is about how to build a search engine and how web crawling works.
I feel excited because the topics discussed in the course were always a black box to me. Plus, knowing how to navigate, search, and properly store files is a valuable skill for every software engineer.
3. Describe any feedback you received or any specific interactions you had. Discuss how they were helpful
I did not receive any meaningful feedback that I can discuss here.
4. Describe your feelings and attitudes
The language of the book was interesting, but it contained a lot of new material that took me a long time to understand, so I did not finish the reading assignment. I’m planning to put in more time this weekend to finish it before starting next week’s reading.
The discussion assignment was helpful in emphasizing the differences between Boolean, phrase, and proximity queries. It also asked me to compare inverted, positional, and biword indexes, which forced me to read these topics more than once and build solid knowledge of them.
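As a minimal sketch of the difference between phrase and proximity queries, the code below builds a small positional index and answers both query types with one helper. The toy documents and the names `build_positional_index` and `within_k` are assumptions made for illustration, and the nested scan over positions is a simplification rather than the linear merge a real system would use.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map term -> {doc_id: [positions where the term occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def within_k(index, t1, t2, k, ordered=False):
    """Return sorted doc IDs where t1 and t2 occur within k positions.

    With ordered=True and k=1 this acts as a two-word phrase query:
    t2 must appear immediately after t1.
    """
    hits = []
    common_docs = set(index.get(t1, {})) & set(index.get(t2, {}))
    for doc_id in sorted(common_docs):
        pos1 = index[t1][doc_id]
        pos2 = index[t2][doc_id]
        for a in pos1:
            if ordered:
                match = any(0 < b - a <= k for b in pos2)
            else:
                match = any(0 < abs(b - a) <= k for b in pos2)
            if match:
                hits.append(doc_id)
                break
    return hits


# Hypothetical toy collection, used only for illustration.
docs = {
    1: "new york university is in new york",
    2: "york is a new city to me",
    3: "a new home in old york",
}
pos_index = build_positional_index(docs)
phrase_hits = within_k(pos_index, "new", "york", k=1, ordered=True)
proximity_hits = within_k(pos_index, "new", "york", k=3)
```

Here the phrase query matches only document 1, while the proximity query with k=3 also accepts document 2, where the two words are three positions apart and in the opposite order.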
5. Describe what you learned
The first chapter (Manning et al., 2009, Boolean Retrieval) was a great introduction to the course. It introduced the problem of information retrieval through the example of Shakespeare’s plays, then defined the terminology used in the field, and finally defined the Boolean retrieval model and described some data structures that can be used to implement it (e.g., the inverted index and postings lists).
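The Boolean model and the inverted index can be sketched in a few lines. The example below is a minimal illustration under my own assumptions: the three toy documents are invented, terms are produced by a plain whitespace split, and the function names `build_inverted_index` and `intersect` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted postings list of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    # Postings are kept sorted so two lists can be merged in linear time.
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping IDs present in both (AND)."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return answer


# Hypothetical toy collection, not data taken from the book.
docs = {
    1: "brutus killed caesar",
    2: "caesar praised brutus",
    3: "calpurnia warned caesar",
}
index = build_inverted_index(docs)
brutus_and_caesar = intersect(index["brutus"], index["caesar"])
```

The query "brutus AND caesar" is answered by walking both postings lists once, which is why keeping the lists sorted by document ID matters.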
The second chapter (Manning et al., 2009, Terms and Postings) dug deeper into the term vocabulary and postings lists. It discussed topics such as tokenization, normalization, stemming, lemmatization, stop words, phrase queries, positional indexes, and biword indexes.
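The preprocessing steps listed above can be chained into a small pipeline. The sketch below is only an approximation under stated assumptions: the stop list is a tiny invented sample, the tokenizer is a simple regular expression, and `crude_stem` is a toy suffix stripper that stands in for a real stemmer such as Porter's; it is not an implementation of it.

```python
import re

# A tiny illustrative stop list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}


def tokenize(text):
    """Split text into tokens made of letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)


def normalize(token):
    """Case-fold a token so 'Caesar' and 'caesar' map to the same term."""
    return token.lower()


def crude_stem(term):
    """Strip a few common suffixes. A toy stand-in, NOT the Porter stemmer."""
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def preprocess(text):
    """Tokenize, normalize, drop stop words, then stem what remains."""
    terms = []
    for token in tokenize(text):
        term = normalize(token)
        if term in STOP_WORDS:
            continue
        terms.append(crude_stem(term))
    return terms


terms = preprocess("The Romans are marching to the gates of Rome")
```

Even this toy version shows why tokenization is harder than it looks: the regular expression splits "isn't" into "isn" and "t", which a real tokenizer would need to handle deliberately.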
6. What surprised me or caused me to wonder?
It surprises me that, although the inverted index was introduced 40 or 50 years ago, it is still the most common way to implement search engines, despite the growing size of the web, the increasing number of users, and the existence of newer techniques that may be more efficient.
I think inverted indexes are still used because efficient data structures are available to maintain the postings lists. These structures include fixed-size arrays, dynamic arrays, linked lists, skip lists, and balanced trees, each of which shines in certain use cases. Selecting the right structure was a key factor in the success of the inverted index.
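As one example of how the choice of structure pays off, the sketch below intersects two sorted postings lists using skip pointers. It is a minimal illustration under my own assumptions: the lists are plain Python lists, skip pointers are simulated at every index that is a multiple of roughly the square root of the list length (a common placement heuristic), and the names `skip_step`, `advance`, and `intersect_with_skips` are hypothetical.

```python
import math


def skip_step(postings):
    """Spacing between skip pointers: about sqrt(length), at least 1."""
    return max(1, math.isqrt(len(postings)))


def advance(postings, i, step, target):
    """Move index i forward towards target, following skips when safe.

    A skip pointer exists only at indexes that are multiples of `step`.
    A skip is taken only if it lands on a value <= target, so a possible
    match is never jumped over.
    """
    moved = False
    while (i % step == 0 and i + step < len(postings)
           and postings[i + step] <= target):
        i += step
        moved = True
    return i if moved else i + 1


def intersect_with_skips(p1, p2):
    """Intersect two sorted postings lists, using skips to save comparisons."""
    s1, s2 = skip_step(p1), skip_step(p2)
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i = advance(p1, i, s1, p2[j])
        else:
            j = advance(p2, j, s2, p1[i])
    return answer


# Toy postings lists chosen for illustration.
list_a = [2, 4, 8, 16, 19, 23, 28, 43]
list_b = [1, 2, 3, 5, 8, 41, 51, 60, 71]
common = intersect_with_skips(list_a, list_b)
```

The result is the same as a plain linear merge. The only difference is that long runs of non-matching document IDs can be jumped over instead of being compared one by one.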
7. What happened that felt particularly challenging? Why was it challenging to me?
Almost all concepts in the domain of information retrieval are new to me, which made it hard to understand the reading assignment without stopping to research what I was reading. I think this is normal for the first week of a new course.
8. What skills and knowledge do I recognize that I am gaining?
On top of everything mentioned above, I think I’m gaining a better understanding of how to solve the search problem, and I’m getting closer to understanding the internal structure of database management systems and why some of them behave the way they do. Although the course is not about databases, I constantly compare what I’m learning to what I already know about them.
9. What am I realizing about myself as a learner?
I’m realizing that I’m a slow learner, that I need to put in more time to understand the concepts discussed in the course, and that I need to be more patient with myself.
10. In what ways am I able to apply the ideas and concepts gained to my own experience?
I’m a web developer, and we use DynamoDB, which is a NoSQL key-value store that does not support complex querying and searching operations. We therefore have to implement these operations ourselves, and I think the concepts discussed in this course will help me do so.
References
- Manning, C. D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information Retrieval (Online ed.). Cambridge, England: Cambridge University Press. Available at https://nlp.stanford.edu/IR-book/pdf/01bool.pdf