Research Proposal

Assignment Description:

Before scientists start experiments or adventurous individuals try to take over the world, they need money. Things like beakers, test tubes, and even sharks with lasers all cost money. Plus, there are lab assistants, henchmen, and rent to pay. That’s why scientists are always looking for funding. They keep writing and rewriting proposals because without money, there’s no science. Understanding who will read your proposal is really important. For example, a science journal won’t accept studies about robots, and the government won’t fund research on robots if they’re looking into cancer. Your proposal needs to show that your work is important and makes sense, but it shouldn’t be too complicated like a lab report.

Research Proposal:

Mohammed Nazib Hossain Khan, Kahyaam Rahim Chaudhary, Shi Cheng Huang

Serhiy Metenko

ENGL 21007

Optimizing Search Speed: A Computational Study

Group Name: Fibonacci Group

University: The City College of New York (CCNY)

Project Dates: 3/28/2024 – 4/18/2024

Budget: $3400

Abstract: 

Search engines have become a common way for people to find answers to their questions. This research aims to show which combinations of search terms, as processed by the search engine's algorithm, lead to the desired results. Using an electronic device with an internet connection, we will enter different search terms into the search box of the search engine, switch their order, record the number of results returned for each attempt, and evaluate the data at the end. We expect that more specific search terms will return fewer results that are more direct and relevant, and that broader terms will return more results that are less so. The expected conclusion is that learning to use search terms skillfully will help people find answers to their queries much more efficiently.

Introduction: 

Search engines, as advanced information retrieval systems, facilitate access to extensive data from diverse sources. Improving the speed of search algorithms makes information retrieval operations and the user experience better. For this reason, a large share of computer science research on search seeks to improve query response time.

One of the major issues in search engine optimization is returning fast and accurate search results, particularly with large databases and complex queries. The Internet has become increasingly complex while generating massive amounts of data, which poses significant challenges to conventional search algorithms and has prompted research into new ways of improving search outcomes.

The primary aim of this project is to investigate algorithmic improvement as a way of increasing search speed. By trying different combinations of keywords related to one topic on Google and then analyzing the results, we can find out how searches could be made more efficient and accurate. This study may contribute to advancing search engine technology, making browsing easier and searching faster for users.

The effects of increasing search speed are felt in several areas, such as AI, data analytics, and information retrieval. By enhancing the efficiency and responsiveness of search engines, we can accelerate decision-making processes, streamline workflows, and create room for cutting-edge research and development. Consequently, this initiative seeks to improve search engine technology in order to address an unsolved fundamental problem in computational science.

Literature Review: 

In their paper titled “Deep Learning Approaches for Faster Database Searches,” Dr. Chen Wei, Dr. Zhang Shuai, and Dr. Hong Li, who are Research Scientists in the Software and Computational Systems Program, share new ways to make database searches quicker. They suggest using neural networks, computer systems that learn from past searches, to predict the best ways to search and so find information in a database faster. The authors report that this approach can make searches much faster and use less computing power. Because Chen, Zhang, and Li are experts in databases, their findings carry weight. Applying their methods could help our project make searches faster and easier.

“Does Google Shape What We Know?” was written by Ralph Schroeder, a senior researcher at the Oxford Internet Institute, University of Oxford. The article describes how Google's search engine has shaped everyday life by letting people use search terms to get their questions answered, whether they are looking for leisure content or other information. This source supports our proposed experiment because Google is used so often in everyday life. Since people depend on the search engine to find information, they need skill with search terms to get the desired result as quickly as possible.

The Science Buddies project “Ready, Set, Search! Race to the Right Answer” was the motivation for our chosen experiment. The topic we want to investigate is the algorithm behind the engine's searches, and how the correct combination of search terms produces accurate and efficient results. The project page also provides background on how search algorithms work and explains that different search terms lead to different results. Many of the steps in our method are similar to those on the Science Buddies website.

Sculley, Holt, Golovin, and colleagues explore the concept of hidden technical debt in machine learning systems, shedding light on the often overlooked complexities of optimizing machine learning algorithms (“Hidden Technical Debt in Machine Learning Systems”). While their focus is on machine learning, their findings underscore the broader challenges of algorithmic optimization and the trade-offs involved in system design.

This research will enrich our project proposal by giving us a more nuanced understanding of the challenges inherent in optimizing search engines. Recognizing and addressing possible technical debt in our algorithmic designs will help mitigate risks associated with scalability, efficiency, and maintainability. Additionally, knowing the trade-offs involved in algorithmic optimization enables us to make informed decisions that balance speed and reliability during database searches.

Drawing parallels between the challenges of machine learning systems and those of database search optimization shows that algorithmic problems cut across disciplines. For search optimization, Sculley et al.'s study provides useful guidance on how we may improve our designs for better results.

Project Narrative:

This experiment consists of a total of nine steps. 

Step 1: Familiarize ourselves with Google and read the Google search basics (basic search help) page. 

Step 2: Think of a topic to search for and make a list of possible words that describe it. For example, if the topic is different kinds of fruits, the list will be (fruit, apple, banana, orange, etc.).

Step 3: Make a data table in the lab notebook to write down each result. 

Step 4: Go to the Google homepage. 

Step 5: Choose search terms for the topic and type each into the search box, trying different combinations of terms and strategies. 

Step 6: Click the “Google Search” button. On the results page, look near the top for the number of hits and write that number in the data table.

Step 7: Repeat the same process for the other search terms, writing down how the number of hits changes in the data table.

Step 8: Make a bar graph of the data on graph paper or a website. Make a scale for the number of hits on the y-axis. Draw a bar for each search term, matching its height to its number of hits on the scale. Make sure that the scale is big enough to fit all the data.

Step 9: Find out which searches had the highest and lowest numbers of hits. Also, find out how these data relate to the number of search terms used and how single search terms compare with different combinations and strategies.
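As an illustration of Steps 5 through 9, the bookkeeping could be sketched in Python roughly as follows. This is a minimal sketch and not part of the Science Buddies procedure: the function names (build_queries, summarize, text_bar_chart) are hypothetical, searches are still entered by hand, and the hit counts in the usage section are placeholders rather than measurements.

```python
from itertools import permutations


def build_queries(terms, max_terms=2):
    """List every ordered combination of 1..max_terms distinct terms (Step 5)."""
    queries = []
    for size in range(1, max_terms + 1):
        # permutations() keeps order, so "fruit apple" and "apple fruit" both appear.
        for combo in permutations(terms, size):
            queries.append(" ".join(combo))
    return queries


def summarize(hits):
    """Return (query with the most hits, query with the fewest hits) (Step 9)."""
    most = max(hits, key=hits.get)
    fewest = min(hits, key=hits.get)
    return most, fewest


def text_bar_chart(hits, width=40):
    """Draw one text bar per query, scaled so the largest count spans `width` (Step 8)."""
    top = max(hits.values())
    lines = []
    for query, count in hits.items():
        bar = "#" * round(count / top * width) if top else ""
        lines.append(f"{query:<20} {bar} {count}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Step 5: the queries to type into the search box by hand.
    for query in build_queries(["fruit", "apple", "banana"]):
        print(query)

    # Steps 6-7: PLACEHOLDER numbers only; replace with the hit counts
    # copied from the results page into the data table.
    recorded_hits = {"fruit": 300, "fruit apple": 120, "apple fruit": 110}

    print(text_bar_chart(recorded_hits))
    print(summarize(recorded_hits))
```

Generating the query list up front makes it easier to check that every ordering of the terms has been tried before the data are compared.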

Personnel:

All group members are students at the City College of New York in Manhattan. For the written proposal, MK was responsible for the title page, introduction, budget, and part of the literature review. KC was responsible for the project narrative, personnel section, and part of the literature review. SC was responsible for the abstract, timeframe, and part of the literature review. For the experiment, MK will be responsible for setting up the experiment, KC for choosing the topic and collecting the data, and SC for comparing the data across different search-term combinations.

Budget:

List of Materials Needed:

Computers (Quantity: 3, Cost: $800 each): Three computers will be required for conducting experiments, implementing algorithms, and collecting experimental data. Each computer is estimated to cost $800, totaling $2400.

Software Licenses (Quantity: 1, Cost: $500): A software license is necessary for accessing advanced algorithms, programming tools, and data analysis software. The estimated cost for a single license is $500.

Datasets (Quantity: 2, Cost: $200 each): Real-world datasets are essential for benchmarking search algorithms and evaluating their performance. We plan to acquire two datasets, with an estimated cost of $200 each, totaling $400.

Miscellaneous Supplies (Cost: $100): Miscellaneous supplies such as notebooks, pens, and other supporting materials for documentation and experimentation will be needed. We estimate the cost of miscellaneous supplies to be $100.

Explanation of Material Use:

Computers: The computers will be utilized for running simulations, implementing algorithms, and collecting experimental data. Their processing power and memory capacity are crucial for conducting rigorous experiments and achieving reliable results.

Software Licenses: Access to specialized software tools and algorithms is essential for developing and testing search algorithms. The software license will enable us to leverage advanced functionalities and optimize search performance.

Datasets: Real-world datasets provide the foundation for evaluating search algorithms and assessing their performance under various conditions. By utilizing diverse datasets, we can ensure the robustness and generalizability of our findings.

Miscellaneous Supplies: Supporting materials such as notebooks, pens, and other office supplies are necessary for documentation, organization, and communication throughout the project. These supplies will facilitate efficient project management and ensure thorough documentation of experimental procedures and results.

Total Budget Estimate: $2400 + $500 + $400 + $100 = $3400

This budget allocation covers the necessary materials and supplies for conducting the proposed project on optimizing search speed through algorithmic optimization.
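The total can be double-checked with a few lines of Python. This is only a sketch: the quantities and unit costs are copied from the list of materials above, the helper name budget_total is hypothetical, and treating Miscellaneous Supplies as a single $100 lump sum is an assumption made for the calculation.

```python
# (item, quantity, unit cost in dollars), copied from the List of Materials Needed.
# Assumption: Miscellaneous Supplies is entered as one lump-sum item.
line_items = [
    ("Computers", 3, 800),
    ("Software Licenses", 1, 500),
    ("Datasets", 2, 200),
    ("Miscellaneous Supplies", 1, 100),
]


def budget_total(items):
    """Sum quantity times unit cost over all line items."""
    return sum(quantity * unit_cost for _, quantity, unit_cost in items)


total = budget_total(line_items)
print(f"Total budget: ${total}")  # prints "Total budget: $3400"
```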

Timeframe

Phase 1: Proposal (chart dates: 3/5–3/7, 3/12–3/26)

Task                             Start    End
Topic Selection                  3/5      3/7
Research and Writing Proposal    3/12     3/21
Revision                         3/21     3/26

Phase 2: Lab Report (chart dates: 3/29–4/16)

Task                             Start    End
Collecting Data                  3/29     4/2
Evaluating Data                  4/3      4/4
Drafting Lab Report              4/5      4/9
Finalizing Lab Report            4/9      4/16

Works Cited 

Science Buddies Staff. “Ready, Set, Search! Race to the Right Answer.” Science Buddies, 14 July 2023, https://www.sciencebuddies.org/science-fair-projects/project-ideas/CompSci_p015/computer-science/search-speed. Accessed 21 Mar. 2024. 

Chen, W., S. Zhang, and H. Li. “Deep Learning Approaches for Query Optimization in Database Systems.” https://dblp.org/db/journals/tkde/tkde34.html. Accessed 20 Mar. 2024.

Schroeder, Ralph. “Does Google Shape What We Know?” Prometheus, vol. 32, no. 2, 2014, pp. 145–60. JSTOR, https://doi.org/10.1080/08109028.2014.984469. Accessed 25 Mar. 2024.

Sculley, D., G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J. F. Crespo, and D. Dennison. “Hidden Technical Debt in Machine Learning Systems.” Advances in Neural Information Processing Systems, vol. 28, 2015, pp. 2503–2511, https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf.

Research Proposal Draft

Optimizing Search Speed: A Computational Study

Group Name: Fibonacci Group

University: The City College of New York (CCNY)

Project Dates: 3/5/2024 – 3/26/2024

Budget: $3400

Abstract

Search engines have become very commonly used in recent years. Using different search terms can lead to different results from the search engine. Many people use search terms to look for something they want but often cannot get their desired result. The purpose of this research is to show which combinations of search terms, as processed by the algorithm behind the search engine, lead to the desired results. The method is to use an electronic device with an internet connection to test different search terms in the search box of the search engine, switch their order, record the number of results returned for each attempt, and evaluate the data. The expected result is that entering specific search terms will return fewer results, which will be the direct and desired outcome. The expected conclusion is that learning to use search terms skillfully will help people find answers to their queries much more efficiently.

Introduction

Search engines, as advanced information retrieval systems, facilitate access to extensive data from diverse sources. It is hard to make information retrieval operations and user experiences better without improving the speed of search algorithms. For this reason, a large share of computer science research on search seeks to improve query response time. One of the major issues in search engine optimization is returning fast and accurate search results, particularly with large databases and complex queries. The Internet has become increasingly complex while generating massive amounts of data, which poses significant challenges to conventional search algorithms and has prompted research into new ways of improving search outcomes. The primary aim of this project is to investigate algorithmic improvement as a way of increasing search speed. By trying different combinations of keywords related to one topic on Google and then analyzing the results, we can find out how searches could be made more efficient and accurate. This study may contribute to advancing search engine technology, making browsing easier and searching faster for users. The effects of increasing search speed are felt in several areas, such as AI, data analytics, and information retrieval. By enhancing the efficiency and responsiveness of search engines, we can accelerate decision-making processes, streamline workflows, and create room for cutting-edge research and development. Consequently, this initiative seeks to improve search engine technology in order to address an unsolved fundamental problem in computational science.

Project Narrative:

This experiment consists of a total of nine steps.

Step 1: Familiarize ourselves with Google and read the Google search basics (basic search help) page.

Step 2: Think of a topic to search for and make a list of possible words that describe it. For example, if the topic is different kinds of fruits, the list will be (fruit, apple, banana, orange, etc.).

Step 3: Make a data table in the lab notebook to write down each result.

Step 4: Go to the Google homepage.

Step 5: Choose search terms for the topic and type each into the search box, trying different combinations of terms and strategies.

Step 6: Click the “Google Search” button. On the results page, look near the top for the number of hits and write that number in the data table.

Step 7: Repeat the same process for the other search terms, writing down how the number of hits changes in the data table.

Step 8: Make a bar graph of the data on graph paper or a website. Make a scale for the number of hits on the y-axis. Draw a bar for each search term, matching its height to its number of hits on the scale. Make sure that the scale is big enough to fit all the data.

Step 9: Find out which searches had the highest and lowest numbers of hits. Also, find out how these data relate to the number of search terms used and how single search terms compare with different combinations and strategies.

Personnel:

All group members are students at the City College of New York in Manhattan. For the written proposal, M.K. was responsible for the title page, introduction, budget, and part of the literature review. K.C. was responsible for the project narrative, personnel section, and part of the literature review. S.C. was responsible for the abstract, timeframe, and part of the literature review. For the experiment, M.K. will be responsible for setting up the experiment, K.C. for choosing the topic and collecting the data, and S.C. for comparing the data across different search-term combinations.

Budget:

List of Materials Needed:

Computers (Quantity: 3, Cost: $800 each): Three computers will be required for conducting experiments, implementing algorithms, and collecting experimental data. Each computer is estimated to cost $800, totaling $2400.

Software Licenses (Quantity: 1, Cost: $500): A software license is necessary for accessing advanced algorithms, programming tools, and data analysis software. The estimated cost for a single license is $500.

Datasets (Quantity: 2, Cost: $200 each): Real-world datasets are essential for benchmarking search algorithms and evaluating their performance. We plan to acquire two datasets, with an estimated cost of $200 each, totaling $400.

Miscellaneous Supplies (Cost: $100): Miscellaneous supplies such as notebooks, pens, and other supporting materials for documentation and experimentation will be needed. We estimate the cost of miscellaneous supplies to be $100.

Explanation of Material Use:

Computers: The computers will be utilized for running simulations, implementing algorithms, and collecting experimental data. Their processing power and memory capacity are crucial for conducting rigorous experiments and achieving reliable results.

Software Licenses: Access to specialized software tools and algorithms is essential for developing and testing search algorithms. The software license will enable us to leverage advanced functionalities and optimize search performance.

Datasets: Real-world datasets provide the foundation for evaluating search algorithms and assessing their performance under various conditions. By utilizing diverse datasets, we can ensure the robustness and generalizability of our findings.

Miscellaneous Supplies: Supporting materials such as notebooks, pens, and other office supplies are necessary for documentation, organization, and communication throughout the project. These supplies will facilitate efficient project management and ensure thorough documentation of experimental procedures and results.

Total Budget Estimate: $2400 + $500 + $400 + $100 = $3400

This budget allocation covers the necessary materials and supplies for conducting the proposed project on optimizing search speed through algorithmic optimization.

Timeframe

T.S. for Topic Selection

R.W. for Research and Writing Proposal

D.D. for Proposal Draft Due Date