/kemono/ - kemono.party

Kemono Development and Discussion

File: 1635804536483.jpg (73.73 KB, 550x367, server_spaghetti_1.jpg)

 No.8965

See this thread for updates on maintenance and other work being performed on the site!

 No.15933

Message from Administration:
Kemono is now unblocked from Verizon, OpenDNS, and various global ISP bans. Additionally, the site should be a bit faster.

Also, if you are using any downloaders, you should no longer need cookies, so feel free to delete them if you'd like.

 No.16342

As you may have noticed, the importer has been disabled. This is due to work being carried out on the site; expect it to be back up and running when the work is concluded. You can still start your import now: it will be added to the queue and imported as soon as the importer is re-enabled.
Sorry for any inconvenience. Thanks for understanding.
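
In other words, submissions are simply queued until the importer comes back. A minimal sketch of that behaviour (not the actual importer code; all names here are made up):

from collections import deque

import_queue: deque[str] = deque()
importer_enabled = False          # flipped back to True once work is concluded

def submit(session_key: str) -> None:
    # keys are always accepted, even while the importer is disabled
    import_queue.append(session_key)

def run_importer() -> None:
    # the queue is drained only while the importer is enabled
    while importer_enabled and import_queue:
        key = import_queue.popleft()
        print(f"importing with key {key!r}")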

 No.16365

The work has been finished and the importer re-enabled. The site should be a bit speedier now too, but please keep your downloading to a reasonable level. Thanks for your patience.

 No.16589

Due to strange behaviour from the data server's HDDs, we are running several checks to make sure it isn't anything to worry about. To speed things up, we are cutting all connections to the data server. The cache servers are still operational: if your file was cached, it will be served without a problem; for non-cached files you will see 5xx HTTP errors.

We estimate the downtime will be around 1-2 days while the checks are running. During this time imports will be put into the queue and run after the downtime is over; other site functions will continue to work.
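
Roughly speaking, the cache servers behave like the sketch below while the data server is unreachable (a hedged illustration only, not Kemono's actual code; the cache path is hypothetical):

from pathlib import Path

CACHE_ROOT = Path("/var/cache/kemono")   # hypothetical cache directory

def serve(file_hash: str) -> tuple[int, bytes | None]:
    """Cache hit -> 200 with the bytes; miss -> 5xx while the origin is cut off."""
    cached = CACHE_ROOT / file_hash
    if cached.exists():
        return 200, cached.read_bytes()  # cached files are served without a problem
    return 503, None                     # non-cached file and no data server to fall back to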

 No.16630

Work has now concluded. For more information, see >>16624, as he has a good explanation and some pretty good suggestions for everyone.

 No.17618

All data servers have now been checked by me personally, and those with badly behaving HDDs got their drives swapped for new ones.

Enjoy the new kemono download speed!

 No.17640

30-60min maintenance work on the backend servers. Some new data might not load for the time being.

 No.17644

Maintenance is finished.

 No.17668

File: 1645664021113.gif (80.22 KB, 220x164, monkey-computer.gif)

The storage array is currently under maintenance. Imports might be halted for the time being.

 No.17688

File: 1645756070014.jpg (230.25 KB, 1907x1079, FEWlb3q.jpg)

Hoo boy, there is no end to these things. Here is a short list, to be followed up with a longer explanation.
Are things fucked? Yes.
How fucked? It is fixable, but will take time.
Is there data loss? Unlikely; I need more time, but we have backups.
Will imports work? Yes, I'll make it work in ~12h.
How long will it take? Don't know, maybe a week or two. I'll make it so that you are able to get the data somehow.
The end-of-month import will run. Just submit your key and, once everything is running, it will import.


Now the longer explanation while things run in the background.
While the periodic flushing of data from RAID1 to the disk array was ongoing, the disks decided to hard reset themselves to the point where the filesystems are now in a corrupted state.
The corruption seems to only apply to the data that was being flushed to the disks, which is still available on the RAID1 mount. This has happened before, and back then the filesystem was repairable,
but this time it is much worse, and I refuse to make any modifications to the filesystems/disks, so as not to fuck them up beyond their current state.
The metadata, and the data up to the point of the "periodic flushing", seem to be doing fine. The hashes match and the data is readable. This is also the reason why you can view all the data right now.
The disks behave when you read from them at full speed, but god is not on your side when you start writing to them. The CRC errors hint at the backplane shitting itself and leaving skidmarks on the disks.
There is also the possibility of NCQ (plus Linux's own queue) shitting itself and Linux resetting the disks. In my day I have seen a multitude of disks reset themselves so hard that you would need to physically disconnect and reconnect them, due to fucked queueing.
But in this case the "hard reset" did bring the disks back, so I can only guess that the firmware got better, or that this is a Linux thing causing the problems. And because the drives were only recently released, nothing is to be found on the mailing lists.

Now, is the data safe? Yes, it is. KP data is backed up periodically, and the last backup was taken a bit before the "periodic flushing". The data on RAID1 is currently being backed up. So this is the least of my worries, I hope.

And finally, about importing. I will have to juggle a lot: manually prime the cache servers with new data, back up and replicate the data to other servers, and a lot more.
BUT I think I will be able to deliver all the data that gets imported while the issues are being resolved. I can't promise a smooth ride, but I think I can bridge the issues we have right now.

fml, I just want a comfy ride without any issues; I'm tired. I just want some enterprise-grade storage systems with support and to not have to care about anything.
Also, I think I forgot to mention that this is only a data server. Nothing else is going on there, so the other parts of the site are not affected by this.

 No.17787

File: 1646193703592.jpg (370.25 KB, 900x903, EyyBiHkWUAEq1Hp.jpg)

Slowly chugging along. The important data has been copied and the delta is being applied. After that, even more data integrity validation is needed.
My estimate is the end of the week, likely earlier. But it's not like you will notice anything, maybe a small downtime. Well, whatever.
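
For reference, this kind of integrity validation boils down to re-hashing files and comparing them against a known-good manifest. A minimal sketch of such a check (not the actual tooling; the manifest format and paths are hypothetical):

import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(manifest: Path, data_root: Path) -> list[str]:
    """Manifest lines look like '<hex digest>  <relative path>'."""
    bad = []
    for line in manifest.read_text().splitlines():
        digest, rel = line.split(maxsplit=1)
        if sha256_of(data_root / rel) != digest:
            bad.append(rel)
    return bad   # empty list == nothing missing, nothing corrupt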

 No.17837

File: 1646422749555.jpg (112.23 KB, 444x460, 1634673234730.jpg)

You fix one thing and then another one pops up. Gotta love the timing.
Nonetheless, ~55% of the data integrity validation is done. Hey, I did mention that it would be slow.
Regarding the download problems with some files right now… your complaints fall on deaf ears.
I'm only making sure that imports will work. The rest will wait until I'm done with recovery.
And I'm still optimistic that I might be done by the end of the week. No promises.

 No.17885

File: 1646598956733.jpg (83.59 KB, 895x504, cg1.JPG)

All data has been verified; nothing is missing and nothing is corrupt. So one could say we are "done", but I'll postpone it by a day.
I've noticed an anomaly that is not dangerous, but rather annoying. I want to figure out its cause before going to prod with this.

 No.17908

File: 1646675665336.png (1.11 MB, 822x1504, 1640159528140.png)

Almost done. All cache servers are now connected to the new system.
Going by the stats, the cache servers were starved quite badly.

Imports have been suspended for the next few hours to see how the system behaves under real load.

 No.17916

File: 1646695121259.png (1.74 MB, 1520x1119, That_Fucking_Bird.png)

Looks like things are now stable, but I will have to monitor the whole thing for a few more days to be sure.
In other news, imports are back online. Tomorrow, at some point, there will be an unannounced ~30min downtime.
I need to adjust the cache servers, and from there expect possibly ~24h of varying download speeds. It will also serve as a stress test for the data server.

 No.17938

File: 1646773858697.jpg (95.74 KB, 726x520, monster_girls.jpg)

Excluding the screaming status page and me not being able to ssh into the data server, I can say with confidence that we are going at MAX SPEED right now.
Well, it will ease up at some point for sure… a few hours should do it, I think. Either way, expect some slowdowns. In ~24-48h it should calm down completely.

 No.18054

File: 1647042560826.jpeg (333.65 KB, 708x1000, b4edc92a7a382eab3df4214ca….jpeg)

With the whopping 14 cache servers we have now (about 14 Gbit of total uplink), download speeds should be above the expectations you had for kemono: no more 429s and no more crazy slow download speeds.

I tested it out myself and it was going at a steady 10-20 MB/s per download.
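
For a rough sense of scale, assuming those figures (a back-of-the-envelope calculation only, not official capacity numbers):

# Assumes ~14 Gbit/s of total uplink and that the 10-20 figure above is MB/s per download.
total_uplink_gbit = 14
aggregate_mb_per_s = total_uplink_gbit * 1000 / 8         # ~1750 MB/s across all caches
per_download_mb_per_s = 15                                # midpoint of 10-20 MB/s
print(round(aggregate_mb_per_s / per_download_mb_per_s))  # ~117 simultaneous full-speed downloads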

If you do experience any slowdowns be sure to tell us here >>14747 with a link to the content that was slow.

 No.18298

File: 1647723801749.jpg (130.55 KB, 1280x938, photo_2022-03-12_20-58-05.jpg)

Changed the filtering rules for the websites; gallery-dl should no longer trigger the anti-DDoS protection and should work as expected.

 No.18654

File: 1648442921880.jpg (25.28 KB, 800x480, suggestion-box-improve-bus….jpg)

Have ideas for what could come next for the importer? Suggest platforms that could be added to the importer using the form linked below.
This isn't a guarantee that they will be implemented; it is only a way to gauge interest in what MIGHT be implemented in the future.

Link: https://forms.gle/DQSfhJMG4AmZLmpAA

 No.18697

File: 1648497359366.webm (1.16 MB, 720x720, fd7bd8cf8accb503fdc045e43….webm)

This is a forwarded reply from >>18576

I don't have anything to directly add to a certain discussion in here, mostly because it is a mess of random arguments, public displays of mental illness, and people seeking answers that have already been provided several times over. So, instead, let's sync up.
>"Patreon?"
The importer is being fixed. It is broken for complicated reasons that are not impossible to resolve, but certainly not trivial. I promise. I believe it is the foundational core of the entire project, and I will not leave it in disrepair longer than is needed.
Moreover, new archiver designs intended to replace Kitsune are being drafted, with the aim of preventing things like this from happening and streamlining the reverse-engineering process, getting (you) more content from across the paywalled web faster.
>"Requests?"
The maintenance of most "community"-related things became the job of other team members months ago, but I stand by their current decisions.
>"Uploader?"
Uploading =/= importing, and the lack of content updates has absolutely nothing to do with the former.
In general, there are multiple development-related reasons why modtools and the fix never happened, and none of them are very important for the general public to know.
What you should know is that Kemono v3.0 is being worked on, with manual sharing and cloud drive snatching prioritized. Get used to the beta UI, by the way. Some form of it will be fully adopted soon-ish.

 No.20232

File: 1651032043721.png (912.95 KB, 730x754, koshiandoh.png)

The Patreon importer is now operational. Thank you for your patience.
Related post: >>20228

 No.22420

File: 1657900383738.png (750.91 KB, 1247x879, cc2aa2c66d85e4bcccd212bc47….png)

Cache maintenance is complete. For the next 24 hours things will be sluggish, but should ramp up in speed afterwards.

 No.26053

File: 1670868718179.jpg (1 MB, 4000x4000, 903ab3f9df9b7d314deeb5820a….jpg)

Cache maintenance is complete and we are finally reaching normal levels of traffic throughput.
As for connection limits, these will be adjusted at a later date based on current ingress/egress ratios.

 No.26644

File: 1672429920852.png (533.29 KB, 571x747, 9901d9b578d0276a265811510a….png)

With the delays and all, the Patreon importer is back in full swing and should import much faster than before. There are still bugs to be ironed out, but 95% of it should work just fine.
Manual imports will always be processed before auto-imports.
Fantia is currently in a half-working state due to interesting rate-limiting policies on Fantia's side. If an import fails, retry until it works (we'll automate this shortly). We'll see if things can be fixed in a reasonable way.
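
Until that automation lands, the "retry until it works" advice amounts to something like the sketch below on the client side (a hedged example; submit_import is a hypothetical placeholder, not a real Kemono or Fantia API):

import random
import time

def retry_import(submit_import, max_attempts: int = 8) -> bool:
    """Call submit_import() until it succeeds or the attempts run out."""
    for attempt in range(max_attempts):
        try:
            submit_import()              # e.g. re-submit the Fantia session key
            return True
        except Exception:
            # exponential backoff with a little jitter, to avoid hammering the rate limit
            time.sleep(min(2 ** attempt, 60) + random.random())
    return False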

 No.27346

File: 1674755472473.gif (2.02 MB, 408x229, oeKn.gif)

ALL importing is halted for the next 3-6 hours while I work on the servers.
Submitting keys still works, and file access and downloading should remain functional throughout the whole process.

 No.27350

File: 1674772101908.jpg (62.55 KB, 791x1024, media_Bih0I5WCAAEQ2_8.jpg)

Imports are back up and running. Fantia and Boosty are halted until patches are available.
Server maintenance is mostly finished; what is left is nothing that you will notice.

 No.27468

File: 1675261491658.png (423.27 KB, 1024x336, OHM2013-9.png)

I guess the firewall was activated

 No.27743

File: 1676380584104.jpg (409.6 KB, 3200x1216, 9bda0715e1835b54ec631a4eb1….jpg)

Due to bandwidth shenanigans, we've had to temporarily change how files are displayed. This will only last for a short while. File names will be replaced by their hashes until everything is sorted out. Everything should still work as normal, though.

Sorry for the inconvenience.

 No.29197

File: 1683153777392.png (237.78 KB, 1200x690, 5b48f0e779095523fc9234e199….png)

The file names are back and files should be saving with their original names again. There might be some speed degradation for the eastern side of the globe; we'll be working on that over the next few weeks. Fantia will stay down.

 No.34501

File: 1702501689587.jpg (85.99 KB, 640x565, GA6bb2UakAAxEnv.jpg)

Due to unexpected network issues over the past two weeks, we'll be moving most of the core components to other hosting providers.
For the duration of the migration, no archivers will be running, unless you want the frontend to go down.
Which I know you want desperately, but I won't give you the gratification of causing that fallout.

In the meantime, touch grass.

 No.35208

File: 1703736873868.jpg (56.19 KB, 876x876, 982cae352c2d7673998ebc46d9….jpg)

Did you touch grass? Did you talk to your family? If not, now is the time. Do it. Socialize.

In other news, most components have been moved and are being monitored for irregularities.
I think you experienced some of those not too long ago. Either way, there will be adjustments along the way.
Ignore the status page for now; it needs to be connected to the new network and I am just a bit lazy about that.
Other than that… relax, ventilate your goon cave, clean up your room, change your bed sheets, and get ready for another year of suffering.

Because the ride never ends.

 No.35478

File: 1704339093424.png (361.92 KB, 596x952, 1704068212413118.png)

The database server is not behaving as I expect it to; I don't like it at all. Imports will be paused for now.
The databases are replicated and backed up to multiple locations in real time. The websites will be up in 1-2 hours.

We'll be monitoring the system state, but it is hard to pinpoint what is causing the issues.
I have a slight inkling of what it might be, but it's too vague.
No "Permanent errors" yet, but if one were to occur, it would be a blessing right now.

Well, there is only one thing to do now: test every component one by one.
Might as well order a different hardware configuration to exclude HW/FW issues.

 No.35620

File: 1704603081182.jpg (12.14 KB, 383x92, what.jpg)

tl;dr moving databases once more

You know, the previous DB server had been running for over two years with no restarts, no crashes, no nothing, except for a whole-ass network department full of retards fucking shit up.
That was followed by a weeks-long back and forth involving every NOC and DC contact in between, which was actively ignored by the network retards, and then by the higher-ups being notified about that mentally challenged department of theirs.
Within minutes of that mail the network was fixed, and it even came with a love letter that read "Vacate the account and servers in 30 days. :D". Everything was fucking daijōbu.

Cue the new server: set up ZFS/VMs/DBs/monitoring, all bells and whistles. Connect all the servers back together and run synthetic load on the databases.
All is good and the webservers are talking to the newly migrated databases. Imports are running, everything is fine.
>Notice Input/Output error on a single block.
>table index, kinda fine, regenerate.
>Wait half a day
>Input/Output error on two more blocks
>It's an actual table this time
Now this makes less and less sense. Check the host storage: checksum errors keep accumulating, yet the NVMe SMART data reports 0 issues.
>Start checking everything up and down, start testing memory a second time
Check the IPMI logs:
>Reading that triggered event 1.73V
>RAM at 1.73V
<what.jpg
After a long memtest, no memory issues. Boot back into the system and move everything from the VMs closer to the host system.
Mempage issues gone, checksum errors are no more; did the reboot do some good? Let it run overnight, and the checksum issues are back.
Possibly shitty WD Gen4 NVMe drives (I've seen some issue threads on the ZFS GitHub), maybe not, who knows at this point. But not that many issues from that point on.
HAH!
>watchdog: BUG: soft lockup - CPU# stuck for 5912s!

This FUCKING system. Fucking shoot me, end me right now. So… we are moving DBs once more. This thing can go to hell, I do not want to deal with recovery because of some shit hardware issues. Either way, moving servers once more.



PS: Oh look, website throws 503, THE FUCKING VM DIED AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

 No.35641

File: 1704672233280.png (486.22 KB, 590x427, hehehehehe.PNG)

It didn't even take 2 hours to do the migration, and 60% of that time was spent copying data over the network. The migration is over, the DBs are safe and sound, and the system behaves. Amen.

 No.37660

File: 1709947546204.png (2.56 MB, 1536x1024, 1709916879443023.png)

The DNS issues have been resolved; the propagation issues remain. Not a thing we have control over.
The root zone TTL is 3 days, and some ISP DNS servers like to use a 14-day TTL.
Flush your system and/or browser DNS cache; most DNS servers should have the most recent configuration by now.
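
If you want to check whether the new records have reached your resolver yet, you can compare your system resolver's answer against a public one. A small sketch using the dnspython package (the domain below is just a placeholder; substitute the one you are checking):

# Requires: pip install dnspython
import dns.resolver

DOMAIN = "kemono.party"                      # placeholder; use the site you are checking

system = dns.resolver.Resolver()             # uses your OS/ISP resolver configuration
public = dns.resolver.Resolver()
public.nameservers = ["1.1.1.1"]             # a public resolver, usually up to date

mine = sorted(rr.to_text() for rr in system.resolve(DOMAIN, "A"))
theirs = sorted(rr.to_text() for rr in public.resolve(DOMAIN, "A"))
print("propagated to you" if mine == theirs else "still cached; flush DNS or wait it out")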


