Project Plan Presentation

Lecture Slides: Not Applicable
Useful Links

Objective

This week, students will present their project plans related to Natural Language Processing (NLP).
Each student will identify a specific topic in any area of NLP, define the problem they aim to address, and describe how they plan to solve it using the concepts learned so far.

Learning Goals

Develop the ability to connect theoretical NLP concepts with real-world applications
Practice defining clear research or implementation problems
Present and communicate project ideas effectively
Receive feedback from peers and the instructor for refinement

Presentation Guidelines

Topic Selection

Choose one of the following directions (or propose your own):

Text classification (sentiment analysis, topic detection, hate speech detection)
Machine translation
Chatbot or conversational agent
Document summarization
Information retrieval or semantic search
Multimodal NLP (text + image/audio)
Retrieval-Augmented Generation (RAG)

Presentation Structure

Each presentation may include:

Background & Motivation — Why is this problem meaningful?
Problem Definition — What exactly are you trying to solve?
Proposed Method — How will you approach this? (model, data, architecture)
Expected Outcome — What do you expect to learn or demonstrate?
Challenges & Risks — What could be difficult?

Guidelines

Research scope
- Design your topic so that it can be completed within this semester.
- Keep the scope manageable and focused rather than overly ambitious.
Collaboration scale
- Projects may be conducted individually or in pairs (1–2 members).
- Make sure the workload is appropriately distributed.
Target publication quality
- Aim for KCI-level journal quality during this semester.
- After the course, students with deeper interest are encouraged to extend their work toward SCIE-level publication through further refinement and experimentation.
Feasibility and sustainability
- Avoid including too many complex objectives in one project.
- A well-defined, achievable goal is better than an overextended plan that risks mid-semester abandonment.
Suggested Materials & Approaches
- All contents of this lecture series
- Relevant NLP papers (ACL, EMNLP, NAACL, etc.)
- Hugging Face Transformers documentation
- Google Research, Meta AI, or OpenAI technical reports
- Any materials related to NLP and your own domain

The Top 10 ML & NLP Projects to Level Up

Summary of The Top 10 BEST ML & NLP Projects to help you level up

Writing a Survey Paper

Description: Create an overview of past approaches used for a specific NLP task (e.g., named entity recognition or event detection).

Benefits: Avoid "reinventing the wheel" and gain a deep understanding of existing work. A survey is also excellent preparation for technical interviews and can be turned into a strong blog post.

System or Systems Evaluation

Description: Use existing systems to determine how well they work, when they work, and whether they are extensible. Focus on evaluation with non-leaderboard data.

Benefits: Learn practical skills such as performing API calls, setting up data pipelines, and integrating multiple systems. Helps identify overfitting to specific benchmarks.
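
As a rough illustration of this kind of evaluation, the sketch below scores one system on a benchmark-style sample and on a small hand-collected sample, then reports the gap. Everything here is hypothetical: `toy_sentiment_system` is a stand-in for whatever existing system or API you evaluate, and both datasets are invented for the example.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, gold label)


def toy_sentiment_system(text: str) -> str:
    """Hypothetical stand-in for an existing system (e.g., a wrapped API call)."""
    positive_words = {"good", "great", "excellent"}
    tokens = text.lower().split()
    return "pos" if any(t in positive_words for t in tokens) else "neg"


def accuracy(system: Callable[[str], str], data: List[Example]) -> float:
    """Fraction of examples where the system's label matches the gold label."""
    correct = sum(1 for text, gold in data if system(text) == gold)
    return correct / len(data)


# Invented sample that resembles clean benchmark data.
benchmark_like = [
    ("a great movie", "pos"),
    ("not worth watching", "neg"),
    ("excellent acting", "pos"),
    ("boring plot", "neg"),
]

# Invented sample that resembles informal, non-leaderboard text.
in_the_wild = [
    ("this slaps", "pos"),
    ("good grief what a mess", "neg"),
    ("10/10 would watch again", "pos"),
    ("meh", "neg"),
]

benchmark_acc = accuracy(toy_sentiment_system, benchmark_like)
wild_acc = accuracy(toy_sentiment_system, in_the_wild)
gap = benchmark_acc - wild_acc

print(f"benchmark-like accuracy: {benchmark_acc:.2f}")
print(f"in-the-wild accuracy:    {wild_acc:.2f}")
print(f"gap:                     {gap:.2f}")
```

A large gap between the two numbers is one possible sign that a system is tuned to its benchmark; in a real project you would replace the toy function and samples with the actual system and your own collected data.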

Open Source Contributions

Description: Contribute code to existing open-source projects. Start with “open issues” on GitHub, especially those tagged for beginners.

Benefits: Your code becomes visible to potential employers. You gain experience with the software engineering life cycle and receive professional code reviews.

Advice: Begin with smaller, specialized packages rather than massive projects like NumPy.

Replicating a Research Paper

Description: Recreate the results of a published paper using available code and data (check GitHub, Papers with Code, Distill, or contact authors).

Benefits: Learn how to understand and work with other researchers’ code — an invaluable real-world skill.

Add Annotations to an Existing Data Set

Description: Add new labels or annotations to an existing dataset.

Benefits: Learn how to design annotation guidelines, perform labeling, and understand data quirks. Improves your data cleaning intuition.

Ablation Study

Description: Take an existing model and remove (“ablate”) components—layers, weights, or training data—to observe the results.

Benefits: Understand which parts of the model matter most. Reducing unnecessary components can lead to simpler, cheaper models. Gain low-cost experience with MLOps experimentation.
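
The idea can be sketched with a deliberately tiny example. The classifier below is a hypothetical rule-based sentiment scorer with two components (a word lexicon and negation handling), not a real model; the data is invented. Each component is switched off in turn and accuracy is re-measured, which is the basic loop of an ablation study.

```python
from typing import Dict, List, Tuple

# Hypothetical components of a toy sentiment classifier.
LEXICON: Dict[str, int] = {"good": 1, "great": 1, "bad": -1, "awful": -1}
NEGATORS = {"not", "never"}


def predict(text: str, use_lexicon: bool = True, use_negation: bool = True) -> str:
    """Score a text; each flag enables or ablates one component."""
    score = 0
    negate = False
    for token in text.lower().split():
        if use_negation and token in NEGATORS:
            negate = True  # flip the polarity of the next token
            continue
        if use_lexicon and token in LEXICON:
            value = LEXICON[token]
            score += -value if negate else value
        negate = False  # negation only applies to the immediately following token
    return "pos" if score > 0 else "neg"


# Invented evaluation data: (text, gold label).
DATA: List[Tuple[str, str]] = [
    ("a good film", "pos"),
    ("not good at all", "neg"),
    ("an awful script", "neg"),
    ("never bad", "pos"),
    ("great cast", "pos"),
    ("not great", "neg"),
    ("good music", "pos"),
    ("great pacing", "pos"),
]


def accuracy(**flags: bool) -> float:
    """Accuracy of the classifier under a given ablation setting."""
    correct = sum(1 for text, gold in DATA if predict(text, **flags) == gold)
    return correct / len(DATA)


results = {
    "full": accuracy(),
    "no_negation": accuracy(use_negation=False),
    "no_lexicon": accuracy(use_lexicon=False),
}

for setting, acc in results.items():
    print(f"{setting:12s} accuracy = {acc:.3f}")
```

With a trained neural model the same pattern applies, with flags replaced by removed layers, zeroed weights, or withheld training subsets, and with each setting ideally repeated over several random seeds.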

Extending a Research Paper

Description: Build upon the findings of a published research paper by adding new methods or improving systems.

Advice: Check the paper’s “future directions” for hints. The more you replicate and analyze systems, the more innovative ideas you’ll generate.

Benefits: Often leads to publishable research results.

Collecting a Completely New Data Set

Description: Gather new data for a task, language, or event that lacks an existing dataset.

Complexity: One of the most time-intensive projects.

Benefits: Teaches how to handle raw, messy, unlabeled real-world data — a critical skill in professional work.

Advice: Focus on a very narrow, specific task (e.g., NER in a low-resource language).

Make Something You Wish Existed

Description: Once you’re confident with your skills (data collection, version control, model deployment), build something genuinely useful that you wish existed.

Benefits: Personal motivation ensures you stay engaged. You also produce a tangible, real-world product demonstrating your ability.

Recreate Something in a New Human Language

Description: Rebuild existing tools (e.g., parsers or tokenizers) for an underserved human language.

Complexity: Highly challenging since you may have to build basic NLP components from scratch.

Benefits: Creates meaningful resources for underrepresented linguistic communities. Perfect for multilingual students.

Activity Plan

Step	Activity	Description
1	Topic brainstorming	Students explore interests and discuss ideas
2	Draft proposal	Submit short project abstract
3	Presentation	Deliver presentation in class
4	Feedback	Instructor and peers provide constructive feedback

Evaluation Criteria

Category	Description	Weight
Relevance	Topic clearly relates to NLP	20%
Originality	Creativity and novelty of approach	20%
Feasibility	Realistic and achievable within term	20%
Clarity	Presentation and communication quality	20%
Engagement	Interaction, discussion, and response	20%
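
To make the weighting concrete, here is a minimal sketch of how the five categories could combine into one overall score. The weights come from the table above; the 0 to 100 scale per category and the example scores are assumptions made only for illustration, since the table does not specify a scoring scale.

```python
from typing import Dict

# Weights in percent, taken from the Evaluation Criteria table.
WEIGHTS_PERCENT: Dict[str, int] = {
    "Relevance": 20,
    "Originality": 20,
    "Feasibility": 20,
    "Clarity": 20,
    "Engagement": 20,
}


def weighted_score(scores: Dict[str, float]) -> float:
    """Combine per-category scores (assumed 0-100) into a weighted total."""
    if set(scores) != set(WEIGHTS_PERCENT):
        raise ValueError("scores must cover exactly the five categories")
    return sum(WEIGHTS_PERCENT[c] * scores[c] for c in WEIGHTS_PERCENT) / 100


# Hypothetical scores for one presentation.
example_scores = {
    "Relevance": 90,
    "Originality": 80,
    "Feasibility": 70,
    "Clarity": 85,
    "Engagement": 75,
}

total = weighted_score(example_scores)
print(total)
```

Because all five weights are equal, the total is simply the average of the category scores; the function form only matters if the weights are later changed.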

Future Plan

After the project plan presentation, each student (or team) will proceed toward a research-oriented mini project, gradually shaping their work into a publishable outcome.

The process is designed to mimic a real academic research cycle — from idea conception to manuscript preparation — within the semester.

Submission Requirements

By the mid-term period, students must submit the following materials:

Abstract
- A concise summary of the research motivation, objectives, and expected contributions.
Proposed Paper Structure
- Draft outline of chapters and sections (e.g., Introduction, Related Work, Methodology, Experiments, Conclusion).
- Each section should briefly describe what content will be included.
Implementation Plan
- Specify the core technical components (models, frameworks, datasets, evaluation metrics).
- Identify what tools or programming languages will be used.
Weekly Research Timeline
- A week-by-week schedule up to the end of the semester, showing milestones such as data collection, model training, evaluation, and writing.

Final Deliverable

By the end of the semester (final exam period), each student will submit a mini research paper summarizing the completed project.
This paper should include empirical results, analysis, and clear reflections on what has been achieved and what could be improved.

The final paper will serve as a foundation for a KCI-level journal submission.
Students with strong results or interest may continue the research after the semester and extend it for SCIE-level publication.

End-State Goal

The ultimate goal of this course is not just to complete a project, but to cultivate the ability to conduct independent, publication-oriented NLP research.

Each student should envision their final output as a KCI-ready manuscript and treat this course as the first step toward contributing to the broader academic community.
