feat(dataset): add final smishing rewrites, documentation, and report by s223737886 · Pull Request #46 · Hardhat-Enterprises/smishing-backend

s223737886 · 2025-05-15T09:08:03Z

Summary

This pull request delivers the finalized dataset and supporting documentation for the Smishing Detection backend project, for the Microsoft Planner task titled "Smishing Message Rewriting for Training and Smishing-report". It also contains a report named Smishing-report that explores how smishing attacks work and why they are effective.

What’s Included

✅ Final processed dataset located at: machine-learning/datasets/Dataset.csv
- 800 messages: original + rewritten smishing variants
- Linked metadata fields: source, intent_type, malicious, threat_level, linked_to, etc.
✅ Dataset documentation under: machine-learning/projects/DatasetDocumentation
- dataset_schema.md
- rewriting_strategy.md
- smishing_taxonomy.md
- traceability_mapping.md
- preprocessing_guidelines.md
✅ Report under: machine-learning/projects/Reports/Smishing_Report.docx
- Although the report is separate from the dataset work above, it examines how smishing attacks work and why they are so effective.
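As a rough illustration of how the linked metadata fields could be checked, the sketch below validates that the expected columns are present and that every `linked_to` value points at an existing row. The column names `id` and `message` are assumptions made for the example (the description only names `source`, `intent_type`, `malicious`, `threat_level`, and `linked_to`), and the sample rows are made up rather than taken from Dataset.csv.

```python
import csv
import io

# Columns named in the PR description, plus "id" and "message", which are
# assumed names for illustration and may differ in the real Dataset.csv.
REQUIRED_COLUMNS = {"id", "message", "source", "intent_type",
                    "malicious", "threat_level", "linked_to"}


def validate_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    rows = list(rows)
    if not rows:
        return ["dataset is empty"]
    missing = REQUIRED_COLUMNS - set(rows[0].keys())
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    known_ids = {row["id"] for row in rows}
    for row in rows:
        link = row["linked_to"].strip()
        # A rewritten variant should point at a row that actually exists.
        if link and link not in known_ids:
            problems.append(f"row {row['id']}: linked_to {link!r} not found")
    return problems


# Tiny made-up sample standing in for machine-learning/datasets/Dataset.csv.
SAMPLE = """id,message,source,intent_type,malicious,threat_level,linked_to
1,Your parcel is held. Pay the fee here.,original,delivery,1,high,
2,Parcel on hold: settle the charge via this link.,rewritten,delivery,1,high,1
3,Account locked. Verify now.,rewritten,banking,1,high,99
"""

problems = validate_rows(csv.DictReader(io.StringIO(SAMPLE)))
print(problems)
```

To run the same check against the real file, the `io.StringIO(SAMPLE)` argument could be swapped for an open file handle, once the column names are aligned with dataset_schema.md.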

Conventions Followed

Branch: sms-rewriting/kalpna (named per contribution guideline format)
Commit message format: follows Conventional Commits (feat, chore, etc.)
Pull request targets: dev branch (not main)
DatasetCombined.csv was removed as part of cleanup

Notes

GitHub may not allow automatic merging due to upstream changes. Please feel free to resolve conflicts manually if required.
This contribution is scoped only to the dataset and does not include model code or frontend tasks.

Planner Task

This PR corresponds to the Microsoft Planner task: Smishing Message Rewriting for Training and Smishing-report
Dataset.csv
dataset_schema.md
preprocessing_guidelines.md
README.md
rewriting_strategy.md
smishing_taxonomy.md
traceability_mapping.md

dec1belPP

Hey @s223737886, there are some changes required before we can review your PR:

Your task's scope is to improve the dataset, so you should not change any existing JavaScript or Python files.
Rename the old dataset or keep it as it is instead of deleting it.
Resolve any conflicts manually before putting in your PR.

Please note that your PR will not be reviewed till all of these changes are made. Thank you.

s223737886 · 2025-05-22T10:50:28Z

I have made the required changes and committed the repo again. I have also included the report from my pull request 47 in the latest commit, as that PR was closed. The report is named Smishing-report.

dec1belPP · 2025-05-23T02:52:46Z

> I have made the required changes and committed the repo again. I have also included the report from my pull request 47 in the latest commit, as that PR was closed. The report is named Smishing-report.

Hey @s223737886, this PR is still not at an acceptable standard for review. To reiterate, please:

You have committed the entire repo again. Please sync your local fork and commit only the files changed or added for your feature.
Your task's scope is to improve the dataset, so you should not change any existing JavaScript or Python files.
Do not delete the old dataset. Please leave it as it is.
Resolve any conflicts locally before putting in your PR.

Please note that your PR will not be reviewed till all of these changes are made. Thank you.

s223737886 · 2025-05-23T07:36:02Z

Thanks for the feedback, Pasindu.

I've now cleaned the branch and made the following updates based on your instructions:

Retained the original DatasetCombined.csv without any changes.

Added a new file Dataset.csv with the rewritten smishing messages.

Included only relevant changes related to the dataset: documentation (DatasetDocumentation) and the report (Smishing_Report.docx).

Removed all unrelated JavaScript or Python file changes from the PR.

Verified the branch is up-to-date with origin/dev.

Let me know if any other changes are needed. Thank you!
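The scope rule discussed in this thread (only dataset, documentation, and report files should change) could be checked before pushing with a sketch like the one below. The allow-list of directories is an assumption derived from the paths in the PR description, and `server/app.js` is a hypothetical out-of-scope file. In practice the list of changed paths might come from something like `git diff --name-only origin/dev...HEAD`.

```python
from pathlib import PurePosixPath

# Directories this PR is expected to touch, taken from the paths in the
# description. Treating them as a strict allow-list is an assumption.
ALLOWED_PREFIXES = (
    PurePosixPath("machine-learning/datasets"),
    PurePosixPath("machine-learning/projects/DatasetDocumentation"),
    PurePosixPath("machine-learning/projects/Reports"),
)


def out_of_scope(changed_paths):
    """Return the changed paths that fall outside the allowed directories."""
    flagged = []
    for raw in changed_paths:
        path = PurePosixPath(raw)
        # A path is in scope if one of its ancestors is an allowed directory.
        in_scope = any(prefix in path.parents for prefix in ALLOWED_PREFIXES)
        if not in_scope:
            flagged.append(raw)
    return flagged


# Hypothetical list of changed files for illustration.
changed = [
    "machine-learning/datasets/Dataset.csv",
    "machine-learning/projects/DatasetDocumentation/dataset_schema.md",
    "machine-learning/projects/Reports/Smishing_Report.docx",
    "server/app.js",
]

flagged = out_of_scope(changed)
print(flagged)
```

An empty result would suggest the branch only touches files related to the dataset task.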

s223737886 added 2 commits April 2, 2025 22:25

Save changes on MongoDB connection issue

1bc370a

Save progress on smishing project

8bc29ea

dec1belPP suggested changes May 15, 2025

dec1belPP added the need changes label May 15, 2025

dec1belPP mentioned this pull request May 15, 2025

docs(report): add smishing attack write-up #47

Closed

dec1belPP changed the title ~~feat(dataset): add final smishing rewrites and documentation~~ feat(dataset): improve smishing dataset May 23, 2025

feat(dataset): add final smishing rewrites, documentation, and report

4ad0162

s223737886 force-pushed the sms-rewriting/kalpna branch from 471e726 to 4ad0162 on May 23, 2025 07:28

s223737886 changed the title ~~feat(dataset): improve smishing dataset~~ feat(dataset): add final smishing rewrites, documentation, and report May 23, 2025

dec1belPP removed the need changes label May 25, 2025

s223737886 wants to merge 3 commits into Hardhat-Enterprises:dev from s223737886:sms-rewriting/kalpna
