SichGate

3 months ago

https://t.co/Ie4JGghNhA

19 days ago

We're attending TechEx North America (May 18–19, San Jose) alongside CIOs, CTOs, and RPA & AI leaders exploring intelligent automation, cybersecurity, and enterprise AI & data at scale. Come speak with the SichGate team. https://t.co/RuOTdcPBOE #TechEx #TechConference #TechExNorthAmerica

about 1 month ago

We're at Web Summit Vancouver next week. If you are deploying into highly regulated industries, come find us to talk about your security posture.

about 1 month ago

@theagenticmind Who's red-teaming the SLM after it's quantized for edge deployment? That's where public sector deployments get interesting. https://t.co/mqwvvs8AF3

sichgate retweeted

about 2 months ago

Looking for a CTO to join SichGate as a technical co‑founder. Prefer AI/ML and/or security engineering background. https://t.co/lN9TLmwHS0

sichgate retweeted

about 1 month ago

One of the clearest lessons from my SLM adversarial evaluation: fine-tuning shifted the attack surface. It did not reduce it. MedGemma-4B improved on exactly one safety dimension after medical fine-tuning. It also incurred 8 critical demographic bias findings in pain assessment and mental health, the exact domains the fine-tuning was supposed to improve. Parameter count isn't a safety proxy.

sichgate retweeted

about 1 month ago

int4 quantization of a safety-tuned model is not a neutral operation. We keep finding cases where the quantized version has a meaningfully different attack surface than the original. Not always worse, sometimes just different in ways that weren't evaluated. @sichgate

sichgate retweeted

2 months ago

Hot take: most "safe" fine-tuned models in healthcare and finance haven't been adversarially tested. They've been vibe checked. We're open-sourcing part of the red-teaming methodology from our research: https://t.co/Ha8FyNGXmk

2 months ago

Read more: https://t.co/LtVZCWnA6J

2 months ago

April Fools' joke: small language models deployed in healthcare are thoroughly tested for adversarial vulnerabilities before going live. (They are not. We checked. 924 times today.)

sichgate retweeted

2 months ago

I spent a few months adversarially testing the small language models deployed in hospitals and financial systems. The largest model failed most and the smallest failed least. The medical model had the worst bias scores, in the exact domains it was fine-tuned for. 5/6 broke under a normal conversation. The field is studying the wrong models... Preprint soon.

2 months ago

@MoshenetsPolina 🌷 ☕️ 💻

2 months ago

The smaller the model, the more people trust it without checking. No idea why that is. Quantization does weird things to alignment. Weird as in "the safety behavior just kind of disappears."

2 months ago

SichGate exists to advance the science of AI red teaming for the systems that matter most. We find vulnerabilities, publish findings, and build open methodology. The field is moving faster than its safety knowledge. Responsible innovation means understanding what you've built before it reaches the people it's meant to serve.

2 months ago

SichGate is now live. It's the first adversarial ML security lab built specifically for small language models. We test the attack surface of the models you've built before they are deployed into healthcare, financial systems, and other highly regulated industries.

2 months ago

Nobody thinks about what quantization does to the failure modes in SLMs. Everyone thinks about what it does to performance.

2 months ago

We tested a 1.1B medical model and logged 11 critical findings: safe messaging failures, demographic bias in clinical assessments, and safety guardrails that degraded across conversation turns. Yet this model had passed internal review.

2 months ago

The security research field has spent years studying models that have the largest safety teams in the world. The models actually running in hospitals and financial systems are 1–3B parameters, fine-tuned, quantized, and tested by almost nobody.
